You can use Python to craft complex queries.

Notice the list comprehension. This sort of query would be tedious to write in plain SQL, where the average of every column has to be spelled out by hand.

Let's break down what this list comprehension is doing.

*[F.avg(c) for c in df.columns if c != 'class']

df.columns is a list of column names (strings).

Filter out the column "class".

Take the average of each column.

The star (*) unpacks the list, so each element is passed to agg as a separate argument.
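
The steps above can be traced in plain Python, without a Spark session. This is only a minimal sketch: the column names are made up, and avg and agg here are hypothetical stand-ins (avg returns a string rather than a Spark Column) chosen just to show the filtering and the unpacking.

```python
# Hypothetical column names standing in for df.columns.
columns = ["sepal_length", "sepal_width", "class"]


def avg(col_name):
    """Stand-in for F.avg: returns a label instead of a Spark Column."""
    return f"avg({col_name})"


def agg(*exprs):
    """Stand-in for agg: receives each expression as a separate argument."""
    return exprs


# Iterate over the column names, drop "class", and wrap the rest in avg.
exprs = [avg(c) for c in columns if c != "class"]

# The star unpacks the list: agg(*exprs) is the same call as
# agg(exprs[0], exprs[1]).
result = agg(*exprs)
print(result)  # ('avg(sepal_length)', 'avg(sepal_width)')
```

Without the star, agg would receive one argument (the whole list) instead of one argument per column.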

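Beware of wrapping Python loops around your Spark code. A loop that triggers a Spark action on every iteration launches a separate job each time, which is usually much slower than expressing the same work as a single query.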