

[Spark] PySpark DataFrame to CSV


pyspark.pandas.DataFrame.to_csv

DataFrame.to_csv(path: Optional[str] = None, sep: str = ',', na_rep: str = '', columns: Optional[List[Union[Any, Tuple[Any, ...]]]] = None, header: bool = True, quotechar: str = '"', date_format: Optional[str] = None, escapechar: Optional[str] = None, num_files: Optional[int] = None, mode: str = 'w', partition_cols: Union[str, List[str], None] = None, index_col: Union[str, List[str], None] = None, **options: Any) -> Optional[str]

Write object to a comma-separated values (csv) file.
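The pandas-on-Spark API mirrors pandas' `to_csv` for most parameters, so the `path=None` behavior can be sketched with plain pandas (a stand-in for illustration; note that with pyspark.pandas, passing a `path` writes a directory of part files rather than a single file):

```python
import pandas as pd

# small sample frame; with pyspark.pandas this would be ps.DataFrame(...)
df = pd.DataFrame({"name": ["kim", "lee"], "age": [30, None]})

# with path omitted (None), the CSV text is returned as a string
# instead of being written to disk
csv_text = df.to_csv(index=False)
print(csv_text)
```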

 

Parameters

path: str, default None

       File path. If None is provided the result is returned as a string.

sep: str, default ','

       String of length 1. Field delimiter for the output file.

na_rep: str, default ''

       Missing data representation.

columns: sequence, optional

       Columns to write.

header: bool or list of str, default True

       Write out the column names. If a list of strings is given it is assumed to be aliases for the column names.

quotechar: str, default '"'

       String of length 1. Character used to quote fields.

date_format: str, default None

       Format string for datetime objects.

escapechar: str, default None

       String of length 1. Character used to escape sep and quotechar when appropriate.

num_files: int, optional

       The number of partitions to be written in the `path` directory when `path` is specified. This is deprecated; use DataFrame.spark.repartition instead.

mode: str, default 'w'

       Python write mode.
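A minimal sketch of how the formatting parameters interact, again using plain pandas as a stand-in for the pandas-on-Spark API (an assumption for illustration; pyspark.pandas forwards these options in the same way):

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "memo": ["a;b", None]})

# sep: field delimiter for the output
# na_rep: text written for missing values
# header: list of aliases replacing the column names
# quotechar: character wrapping fields that contain the delimiter
out = df.to_csv(
    index=False,
    sep=";",
    na_rep="N/A",
    header=["ID", "MEMO"],
    quotechar="'",
)
print(out)
```

Because "a;b" contains the delimiter, it is wrapped in the custom quote character, and the missing value is written as N/A.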

 


