Creating a Spark DataFrame from a nested List
To create a Spark DataFrame from a nested list, map each inner list to a tuple so that `toDF` can derive a schema from the tuple's element types:
```scala
import org.apache.spark.sql._

val spark = SparkSession.builder().master("local").getOrCreate()
import spark.implicits._

val values = List(
  List("A", "3", "3", "4", "3", "3"),
  List("B", "2", null, "4", "3", "2"),
  List("B", "1", "3", "5", "1", "5"),
  List("C", "2", "4", null, "3", "3"),
  List("C", "3", null, "4", null, "1")
).map(x => (x(0), x(1), x(2), x(3), x(4), x(5)))

val df = values.toDF("name", "v1", "v2", "v3", "v4", "v5")
df.show
```
| name | v1 | v2   | v3   | v4   | v5 |
|------|----|------|------|------|----|
| A    | 3  | 3    | 4    | 3    | 3  |
| B    | 2  | null | 4    | 3    | 2  |
| B    | 1  | 3    | 5    | 1    | 5  |
| C    | 2  | 4    | null | 3    | 3  |
| C    | 3  | null | 4    | null | 1  |
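This works because every inner list contains only `String`s (`null` is a valid `String` value), so the mapped tuples are `(String, String, String, String, String, String)`, for which `spark.implicits._` supplies an encoder. A small plain-Scala sketch (no Spark needed) shows what the compiler infers:

```scala
// All elements are Strings, so the inner lists stay List[String]
// and the mapped tuples are (String, String) — encodable by Spark.
val homogeneous = List(List("A", "3"), List("B", null))
val asTuples: List[(String, String)] = homogeneous.map(x => (x(0), x(1)))

// Adding an Int widens the inferred element type to Any:
// mixed has type List[List[Any]], and tuples built from it
// would be (Any, Any) — which Spark cannot encode.
val mixed = List(List("A", 3))
```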
However, the following variant fails: mixing `String` and `Int` values widens the inner lists' element type to `Any`, so the mapped tuples are `(Any, Any, ...)`, and Spark has no implicit encoder for `Any` (it typically aborts with an error along the lines of `Schema for type Any is not supported`).
```scala
val values = List(
  List("A", "3", "3", "4", "3", 3),
  List("B", "2", null, "4", "3", 2),
  List("B", "1", "3", "5", "1", 5),
  List("C", "2", "4", null, "3", 3),
  List("C", "3", null, "4", null, 1)
).map(x => (x(0), x(1), x(2), x(3), x(4), x(5)))

// Fails: the tuples have type (Any, ..., Any), for which no encoder exists.
val df = values.toDF("name", "v1", "v2", "v3", "v4", "v5")
```
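One way to make the mixed-type version work (a sketch, not the only option) is to skip tuple encoding entirely: wrap each inner list in a `Row` and pass an explicit schema to `spark.createDataFrame`, so Spark never has to infer types from `Any`. This assumes the `spark` session created above:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Declare the column types explicitly; the last column is the Int one.
val schema = StructType(Seq(
  StructField("name", StringType, nullable = true),
  StructField("v1", StringType, nullable = true),
  StructField("v2", StringType, nullable = true),
  StructField("v3", StringType, nullable = true),
  StructField("v4", StringType, nullable = true),
  StructField("v5", IntegerType, nullable = true)
))

// Row accepts Any*, so List[Any] elements are fine here.
val rows = List(
  List("A", "3", "3", "4", "3", 3),
  List("B", "2", null, "4", "3", 2),
  List("B", "1", "3", "5", "1", 5),
  List("C", "2", "4", null, "3", 3),
  List("C", "3", null, "4", null, 1)
).map(x => Row(x: _*))

val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
df.show
```

Alternatively, if the last column does not actually need to be numeric, simply converting every value to a `String` (e.g. `x.map(v => if (v == null) null else v.toString)`) keeps the original tuple-based approach working.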