
Summary of data types and storage formats in Hive

2024-07-12


1. Data types

Hive supports many data types, which fall into two categories: primitive data types and complex data types. The data types supported by Hive are listed below.

Primitive data types:

1. Numeric types:

    tinyint: 1-byte signed integer
    smallint: 2-byte signed integer
    int: 4-byte signed integer
    bigint: 8-byte signed integer
    float: 4-byte single-precision floating-point number
    double: 8-byte double-precision floating-point number
    decimal: high-precision numeric type whose precision and scale can be specified, such as decimal(10,2).

Byte: one of the basic units of computer storage. 1 byte occupies 8 bits, so a signed 1-byte value ranges from -128 to 127 (negative range: -128 to -1; non-negative range: 0 to 127).

2. String types:

    string: variable-length string
    varchar: variable-length string with a maximum length limit, such as varchar(255).
    char: fixed-length string, such as char(10).

3. Date/time types:

    timestamp: timestamp containing both date and time, accurate to nanoseconds
    date: contains only the date part, with no time part
    interval: time interval, representing the difference between two dates or times

4. Boolean type:

    boolean: Boolean value, either true or false

5. Binary type:

    binary: byte array of arbitrary length
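To make the primitive types above more concrete, the following query casts a few literals to them. This is only a sketch: the literal values and column aliases are invented for illustration, and it assumes a Hive release that accepts SELECT without a FROM clause (Hive 0.13 or later).

```sql
-- Illustrative only: all literals and aliases below are hypothetical.
SELECT
  CAST(127 AS tinyint)                      AS max_tinyint,   -- largest value a 1-byte signed integer can hold
  CAST(123.456 AS decimal(10,2))            AS two_decimals,  -- scale 2 keeps two digits after the decimal point
  CAST('hello world' AS varchar(5))         AS short_text,    -- values longer than the declared limit are truncated
  CAST('2024-07-12' AS date)                AS day_only,      -- date part only, no time of day
  CAST('2024-07-12 10:30:00' AS timestamp)  AS day_and_time,  -- date plus time of day
  true                                      AS flag;          -- boolean literal
```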

Complex data types:

1. Array type

    array<T>: ordered list containing multiple elements of the same type, such as array<int>

2. Map type

    map<K, V>: unordered collection of key-value pairs, where the key must be a primitive type and the value can be any data type, such as map<string, int>

3. Struct type

    struct<col1: type1, col2: type2, ...>: record containing multiple fields, each of which can have a different data type, for example struct<name: string, age: int>

    CREATE TABLE example_table (
        tinyint_col tinyint,
        smallint_col smallint,
        int_col int,
        bigint_col bigint,
        float_col float,
        double_col double,
        decimal_col decimal(10, 2),
        string_col string,
        varchar_col varchar(255),
        char_col char(10),
        timestamp_col timestamp,
        date_col date,
        boolean_col boolean,
        binary_col binary,
        array_col array<int>,
        map_col map<string, int>,
        struct_col struct<name: string, age: int>,
        union_col uniontype<int, string>
    );
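The complex columns of example_table can be built and read back roughly as follows. This is a sketch rather than a complete recipe: the literal values and the map key 'a' are assumptions made for illustration, while array(), map() and named_struct() are Hive built-in constructor functions.

```sql
-- Build one value of each complex type from literals (no table needed).
SELECT
  array(1, 2, 3)                            AS array_val,
  map('a', 1, 'b', 2)                       AS map_val,
  named_struct('name', 'Alice', 'age', 30)  AS struct_val;

-- Read individual elements back from example_table.
SELECT
  array_col[0]     AS first_element,  -- arrays are indexed from 0
  map_col['a']     AS value_for_a,    -- look up a map value by its key ('a' is a hypothetical key)
  struct_col.name  AS person_name     -- access a struct field with dot notation
FROM example_table;
```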

2. Hive file storage formats

Hive storage formats fall into two categories:

Plain-text file type: textfile, which is uncompressed and is also Hive's default storage format.

Binary file types:

sequencefile: compressed; data cannot be loaded into it from a plain-text file with the load method (LOAD DATA).

orcfile: compressed; data cannot be loaded into it from a plain-text file with the load method.

parquet: compressed; data cannot be loaded into it from a plain-text file with the load method.

rcfile: compressed; data cannot be loaded into it from a plain-text file with the load method.

For these binary formats, data is usually written with INSERT ... SELECT from another table, because LOAD DATA places files in the table directory without converting them.

The textfile and sequencefile storage formats are row-based, whereas orcfile and parquet are column-based.

When creating a table, you can use stored as parquet to specify the table's storage format, for example:

    create table if not exists stocks_parquet (
        track_time string,
        url string,
        session_id string,
        referer string,
        ip string,
        end_user_id string,
        city_id string
    )
    stored as parquet;
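Because a plain-text file cannot simply be loaded into a Parquet table, one common pattern is to load it into a text-format staging table first and then copy the rows across. The sketch below assumes a tab-separated input file; the staging table name stocks_text and the path /tmp/stocks.tsv are hypothetical.

```sql
-- Hypothetical staging table that matches the layout of a tab-separated text file.
CREATE TABLE IF NOT EXISTS stocks_text (
  track_time string,
  url string,
  session_id string,
  referer string,
  ip string,
  end_user_id string,
  city_id string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- LOAD DATA places the file in the table directory as-is; it does not convert it.
LOAD DATA LOCAL INPATH '/tmp/stocks.tsv' INTO TABLE stocks_text;

-- Hive writes the rows out as Parquet while executing the INSERT.
INSERT INTO TABLE stocks_parquet
SELECT * FROM stocks_text;
```

The staging table can be dropped once the Parquet table has been populated.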

To change Hive's default storage format, edit the following property in the Hive configuration:

    <property>
        <name>hive.default.fileformat</name>
        <value>TextFile</value>
        <description>
            Expects one of [textfile, sequencefile, rcfile, orc].
            Default file format for CREATE TABLE statement. Users can explicitly override it by CREATE TABLE ... STORED AS [FORMAT]
        </description>
    </property>
You can also change it for the current session with the set command:

    set hive.default.fileformat=TextFile;
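One way to check that a session-level change took effect is to create a table without a STORED AS clause and inspect it. This is a minimal sketch: the table name format_check is hypothetical, and ORC is used only because it is one of the values listed in the property description above.

```sql
-- Show the current value of the setting.
SET hive.default.fileformat;

-- Override it for this session only.
SET hive.default.fileformat=ORC;

-- Hypothetical table created without a STORED AS clause, so the default applies.
CREATE TABLE format_check (id int);

-- The SerDe, InputFormat and OutputFormat lines in the output show the storage format in use.
DESCRIBE FORMATTED format_check;
```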