Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Import a CSV with an inconsistent number of columns per row, keeping the original header, using pandas

How can I read a CSV like the one below and keep the original column names? Ideally, generic column names would be appended to the header, depending on the maximum number of columns found in the body of the CSV.

a,b,c
1,2,3
1,2,3,
1,2,3,4

A simple read_csv call does not work; the wider rows are skipped:

tempfile = pd.read_csv(path 
                 ,index_col=None
                 ,sep=','
                 ,header=0
                 ,error_bad_lines=False
                 ,encoding = 'unicode_escape'
                 ,warn_bad_lines=True
                 )
b'Skipping line 3: expected 3 fields, saw 4
Skipping line 4: expected 3 fields, saw 4
'

I need this kind of result:

a,b,c,x1
1,2,3,NA
1,2,3,NA
1,2,3,4
Question from: https://stackoverflow.com/questions/65911172/import-csv-with-inconsistent-count-of-columns-per-row-with-original-header-use-p


1 Answer


One approach is to read just the header row first, extend it with extra generic names, and then pass the full list of column names to pandas as a parameter. For example:

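import csv

import pandas as pd

filename = "input.csv"

# Read only the first row to get the original column names.
with open(filename, newline="") as f_input:
    header = next(csv.reader(f_input))

# Append generic names x1..x9 to cover rows that are wider than the header.
header += [f'x{n}' for n in range(1, 10)]

tempfile = pd.read_csv(filename,
                 index_col=None,
                 sep=',',
                 skiprows=1,            # skip the original header row
                 names=header,          # use the extended header instead
                 # pandas >= 1.3; older versions used
                 # error_bad_lines=False, warn_bad_lines=True
                 on_bad_lines='warn',
                 encoding='unicode_escape',
                 )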

skiprows=1 tells pandas to skip the original header row, and names supplies the full list of column names to use instead. Rows with fewer fields than there are names are padded with NaN.
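
To see what skiprows and names do together, here is a minimal sketch that applies the same idea to the sample data from the question, held in memory with io.StringIO rather than read from a file. The single extra name x1 is chosen by hand for this sample and is an assumption, not part of the original answer.

```python
import io

import pandas as pd

# Sample data from the question, kept in memory instead of a file.
sample = "a,b,c\n1,2,3\n1,2,3,\n1,2,3,4\n"

# Original header plus one generic name for the extra field (assumed by hand).
names = ["a", "b", "c", "x1"]

# skiprows=1 drops the original header line; names replaces it.
df = pd.read_csv(io.StringIO(sample), skiprows=1, names=names)

print(df)
```

The first two rows get NaN in x1 (one is short, the other ends with an empty field), and the last row keeps its fourth value.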

The header would then contain:

['a', 'b', 'c', 'x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7', 'x8', 'x9']
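
If nine extra names are more than needed, one possible variation is to scan the file once to find the widest row and generate only as many generic names as that requires, which is closer to what the question asks for. The helper name read_ragged_csv and the temporary sample file are hypothetical, and the encoding option from the answer is left out for brevity, so treat this as a sketch under those assumptions rather than the answer's own method.

```python
import csv
import os
import tempfile

import pandas as pd


def read_ragged_csv(path):
    """Read a CSV whose body rows may be wider than its header (hypothetical helper)."""
    with open(path, newline="") as f_input:
        rows = csv.reader(f_input)
        header = next(rows)
        # The widest row decides how many generic names are needed.
        width = max([len(header)] + [len(row) for row in rows])

    # Append x1, x2, ... only for the columns the header does not cover.
    names = header + [f"x{n}" for n in range(1, width - len(header) + 1)]
    return pd.read_csv(path, skiprows=1, names=names)


# Usage with the sample data from the question, written to a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "input.csv")
    with open(sample_path, "w", newline="") as f_output:
        f_output.write("a,b,c\n1,2,3\n1,2,3,\n1,2,3,4\n")
    df = read_ragged_csv(sample_path)

print(list(df.columns))  # ['a', 'b', 'c', 'x1']
```

This reads the file twice, which is usually fine for small inputs. For large files, keeping the fixed range of names and dropping all-empty columns afterwards with dropna(axis=1, how='all') may be cheaper.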
