import pandas as pd
names_df = pd.read_csv("name.basics_sample_500.tsv", sep="\t")
names_df.head(n=3)
# names_df.sample(n=3)
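header: Row number(s) to use as the column names, and the start of the data, or None if the file has no header row.
# header=1 takes the second line as the column names and drops the real header row, which is not what we want for our data: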
names_df = pd.read_csv("name.basics_sample_500.tsv", sep="\t", header=1)
names_df.head(n=3)
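To make the effect of header easier to see without the real file, here is a minimal sketch on a tiny in-memory TSV. The rows (and the nconst column name) are made-up stand-ins, not values taken from name.basics_sample_500.tsv.

```python
import io
import pandas as pd

# Made-up two-row TSV standing in for the real file (hypothetical values).
tsv_text = "nconst\tprimaryName\nnm0000001\tAlice Example\nnm0000002\tBob Example\n"

# header=0: line 0 supplies the column names, data starts on line 1.
default_df = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=0)

# header=1: line 1 supplies the column names, line 0 is skipped entirely.
shifted_df = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=1)

print(list(default_df.columns), len(default_df))
print(list(shifted_df.columns), len(shifted_df))
```

With header=1 the first data record ends up as the column labels, so one row of data is silently lost.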
names: List of column names to use. If the file contains no header row, you should explicitly pass header=None.
# header=0 marks the file's own header row, so names= replaces it instead of treating it as data
names_df = pd.read_csv("name.basics_sample_500.tsv", sep="\t",
                       header=0,
                       names=["nID", "ArtistName", "birth", "death", "Profession", "movies"])
names_df.head(n=3)
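The other case mentioned above, a file with no header row at all, can be sketched like this. The in-memory data is invented for illustration; only the header=None and names parameters come from the text.

```python
import io
import pandas as pd

# Made-up headerless TSV (hypothetical values): every line is data.
no_header_tsv = "nm0000001\tAlice Example\nnm0000002\tBob Example\n"

named_df = pd.read_csv(
    io.StringIO(no_header_tsv),
    sep="\t",
    header=None,                  # the first line is data, not column names
    names=["nID", "ArtistName"],  # so we supply the names ourselves
)

print(named_df)
```

Both lines are kept as rows, and the labels come only from names.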
nrows: Number of rows of file to read. Useful for reading pieces of large files.
Other useful parameters are chunksize and iterator, which read the file piece by piece instead of all at once.
names_10_df = pd.read_csv("name.basics_sample_500.tsv", sep="\t", nrows=10)
names_10_df
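As a rough sketch of the chunksize alternative mentioned above: when chunksize is given, read_csv returns an iterator of DataFrames rather than a single DataFrame. The ten rows below are generated placeholders, not real records.

```python
import io
import pandas as pd

# Ten generated placeholder rows (hypothetical values) plus a header line.
tsv_text = "nconst\tbirthYear\n" + "".join(
    f"nm{i:07d}\t{1900 + i}\n" for i in range(10)
)

chunk_sizes = []
total_rows = 0

# chunksize=4: each iteration yields a DataFrame with at most 4 rows.
for chunk in pd.read_csv(io.StringIO(tsv_text), sep="\t", chunksize=4):
    chunk_sizes.append(len(chunk))
    total_rows += len(chunk)

print(chunk_sizes, total_rows)
```

Unlike nrows, which stops after the first N rows, this walks through the whole file while holding only one chunk in memory at a time.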
na_values: Additional strings to recognize as NA/NaN. If a dict is passed, specific per-column NA values.
names_10_df = pd.read_csv("name.basics_sample_500.tsv", sep="\t", nrows=10, na_values=["\\N"])
names_10_df
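The dict form of na_values can be sketched as follows. The column names birthYear and deathYear and the rows are assumptions made up for this example; the point is only that the marker is treated as missing in the listed column and left alone elsewhere.

```python
import io
import pandas as pd

# Made-up rows (hypothetical values); "\\N" in Python source is the literal text \N.
tsv_text = (
    "primaryName\tbirthYear\tdeathYear\n"
    "Alice Example\t1950\t\\N\n"
    "Bob Example\t\\N\t2001\n"
)

# Only deathYear treats \N as missing; birthYear keeps it as a plain string.
per_column_df = pd.read_csv(
    io.StringIO(tsv_text),
    sep="\t",
    na_values={"deathYear": ["\\N"]},
)

print(per_column_df)
print(per_column_df.isna().sum())
```

Passing a list, as in the cell above, applies the marker to every column, which is usually what you want when a file uses one missing-value marker throughout.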
# note that rename returns a new DataFrame; names_10_df itself is left unchanged
names_10_df.rename(columns={"primaryName": "ActorName"})
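A small sketch of what that means in practice, using a throwaway DataFrame (people_df and its rows are hypothetical): the renamed result has to be assigned to a variable to be kept.

```python
import pandas as pd

# Throwaway DataFrame with made-up rows, only to show the behaviour of rename.
people_df = pd.DataFrame({"primaryName": ["Alice Example", "Bob Example"]})

# rename returns a new DataFrame; keep it by assigning the result.
renamed_df = people_df.rename(columns={"primaryName": "ActorName"})

print(list(people_df.columns))   # original labels are untouched
print(list(renamed_df.columns))  # new labels live on the returned object
```

Assigning back to the same name (people_df = people_df.rename(...)) is the usual way to make the change stick.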