Hive supports several file formats:

Text File
SequenceFile
RCFile
Avro Files
ORC Files
Parquet
Custom INPUTFORMAT and OUTPUTFORMAT

The hive.default.fileformat configuration parameter, which is available in hive-site.xml, determines the format to use when none is specified in a CREATE TABLE or ALTER TABLE statement. Its default value is TextFile.

What is a File Format?

A file format is the way in which information is stored or encoded in a computer file. In Hive, it refers to how records are stored inside the file. File formats differ mainly in data encoding, compression rate, space usage, and disk I/O.

Hive does not verify whether the data you are loading matches the table's schema. However, it does verify whether the file format matches the table definition.

Text File

TEXTFILE is a widely used input/output format in Hadoop. In Hive, a table defined as TEXTFILE can load data from plain text files such as csv (Comma Separated Values), tsv, and txt files.
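
As a rough illustration of the points above, the following HiveQL sketch sets the default file format for the session, creates a TEXTFILE table, and loads a comma-separated file into it. The table name employees_text, its columns, the comma delimiter, and the path /tmp/employees.csv are all hypothetical and chosen only for this example; adjust them to your own data.

```sql
-- Default format for tables created without a STORED AS clause.
-- TextFile is already the default; it is set here only to make it explicit.
SET hive.default.fileformat=TextFile;

-- Hypothetical table: name, columns, and delimiter are assumptions.
CREATE TABLE employees_text (
  id     INT,
  name   STRING,
  salary DOUBLE
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Hypothetical local path. Hive moves or copies the file as-is and does not
-- check its contents against the table schema at load time.
LOAD DATA LOCAL INPATH '/tmp/employees.csv' INTO TABLE employees_text;

-- Inspect the table's storage information (InputFormat, OutputFormat, SerDe).
DESCRIBE FORMATTED employees_text;
```

Because the data is not validated on load, values that do not fit the declared column types typically surface as NULL when the table is queried, so it is worth running a quick SELECT after loading.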