
PySpark: turn off scientific notation when writing CSV


When I wrote a Spark DataFrame to a CSV file in Databricks, I found that a number column came out like 2.6220427383E10, in scientific notation.
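Here is a minimal sketch that reproduces the symptom; the column name, value, and output path are placeholders, not the real job:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy DataFrame with a double column holding the value from this article.
df = spark.createDataFrame([(26220427383.0,)], ["TOTAL_PRICE"])

# The CSV cell comes out as "2.6220427383E10", because Spark renders
# large double values in scientific notation when writing CSV.
df.write.mode("overwrite").csv("/tmp/sale_csv", header=True)
```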

I searched on Stack Overflow and found a lot of solutions, such as casting the column to DecimalType(18, 0).
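For reference, a rough sketch of that kind of workaround (DataFrame, column name, and path are placeholders): cast the double column to a decimal type before writing, so the value is printed as plain digits.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import DecimalType

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame([(26220427383.0,)], ["TOTAL_PRICE"])

# Casting to DecimalType(18, 0) makes the CSV writer emit 26220427383
# instead of 2.6220427383E10.
df_decimal = df.withColumn("TOTAL_PRICE", df["TOTAL_PRICE"].cast(DecimalType(18, 0)))
df_decimal.write.mode("overwrite").csv("/tmp/sale_csv_decimal", header=True)
```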

But in my case, the cause turned out to be that the column had been divided by 100, like this:

(df_sale['TOTAL_PRICE'].cast('integer')/100).alias('TOTAL_PRICE')

Although display(df_sale) shows the value like an integer, when I write it to a CSV file it becomes 2.6220427383E10.
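The underlying reason is that dividing an integer column by 100 promotes it to a double in Spark SQL, and the CSV writer prints large doubles in scientific notation regardless of how display() renders them. A quick schema check makes this visible (df_sale below is a toy stand-in for the article's DataFrame):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy stand-in for df_sale with an integer price column.
df_sale = spark.createDataFrame([(123456,)], ["TOTAL_PRICE"])

df_check = df_sale.select(
    (df_sale["TOTAL_PRICE"].cast("integer") / 100).alias("TOTAL_PRICE")
)

# Division returns a double, even though both operands look integer-like.
df_check.printSchema()
# root
#  |-- TOTAL_PRICE: double (nullable = true)
```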

So the solution is to apply cast('integer') one more time after the division, like this:

(df_sale['TOTAL_PRICE'].cast('integer')/100).cast('integer').alias('TOTAL_PRICE')

After that, everything works fine.
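Putting it together, an end-to-end sketch of the fix (toy DataFrame and output path are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy stand-in for df_sale; TOTAL_PRICE is stored as an integer times 100.
df_sale = spark.createDataFrame([(1234500,)], ["TOTAL_PRICE"])

df_fixed = df_sale.select(
    (df_sale["TOTAL_PRICE"].cast("integer") / 100)
    .cast("integer")
    .alias("TOTAL_PRICE")
)

# The CSV now contains 12345 instead of a scientific-notation double.
df_fixed.write.mode("overwrite").csv("/tmp/sale_csv_fixed", header=True)
```

Note that the final cast('integer') truncates any fractional part left over from the division, so this only fits if TOTAL_PRICE is expected to be a whole number after dividing by 100.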
