
Getting started with DataX

2024-07-12


Download DataX

https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202308/datax.tar.gz

After downloading, extract the archive to a local directory, change into its bin directory, and run a synchronization job:

$ cd {YOUR_DATAX_HOME}/bin
$ python datax.py {YOUR_JOB.json}

Requirements: Python and JDK 1.8 to run DataX. Maven 3 is only needed if you build DataX from source.
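Before running a job, it can help to confirm that the required tools are on the PATH. The following is a minimal sketch, not part of DataX: the function name missing_tools is made up for illustration, and the executable names "python" and "java" are assumptions that may differ on your system (for example "python3").

```python
import shutil


def missing_tools(required=("python", "java")):
    """Return the names in `required` that cannot be found on PATH.

    The default names are assumptions; adjust them to your environment.
    """
    return [tool for tool in required if shutil.which(tool) is None]
```

Calling missing_tools() returns an empty list when everything is found; any names it returns need to be installed or added to the PATH first.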

Step 1: Create a job configuration file (JSON format)

Template:

#stream2stream.json
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "sliceRecordCount": 10,
            "column": [
              {
                "type": "long",
                "value": "10"
              },
              {
                "type": "string",
                "value": "hello, world-DataX"
              }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "encoding": "UTF-8",
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 5
      }
    }
  }
}
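If you would rather generate this file than type it by hand, the same structure can be built as a Python dictionary and written out with the standard json module. This is only a sketch: build_stream_job and write_job are hypothetical helper names, and the defaults simply mirror the values in the template above.

```python
import json


def build_stream_job(record_count=10, channels=5):
    """Build the stream2stream job shown above as a plain dict."""
    return {
        "job": {
            "content": [
                {
                    "reader": {
                        "name": "streamreader",
                        "parameter": {
                            "sliceRecordCount": record_count,
                            "column": [
                                {"type": "long", "value": "10"},
                                {"type": "string", "value": "hello, world-DataX"},
                            ],
                        },
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {"encoding": "UTF-8", "print": True},
                    },
                }
            ],
            "setting": {"speed": {"channel": channels}},
        }
    }


def write_job(path, job):
    """Write the job dict to `path` as indented UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(job, handle, indent=2, ensure_ascii=False)
```

For example, write_job("stream2stream.json", build_stream_job()) produces a file equivalent to the template. Going through json.dump also guarantees the file is valid JSON, which rules out stray commas or quotes.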

Step 2: Start DataX

$ cd {YOUR_DATAX_HOME}/bin
$ python datax.py ./stream2stream.json 
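The same launch can be scripted, for example from a scheduler. The sketch below only wraps the command shown above with the standard subprocess module; datax_command and run_job are hypothetical names, and python_exe is an assumption you should set to whichever interpreter works with your copy of datax.py.

```python
import subprocess
import sys
from pathlib import Path


def datax_command(datax_home, job_file, python_exe=sys.executable):
    """Build the argument list equivalent to `python datax.py <job_file>`."""
    launcher = Path(datax_home) / "bin" / "datax.py"
    return [python_exe, str(launcher), str(job_file)]


def run_job(datax_home, job_file, python_exe=sys.executable):
    """Run the job and return the exit code of the datax.py process."""
    completed = subprocess.run(datax_command(datax_home, job_file, python_exe))
    return completed.returncode
```

A call such as run_job("/path/to/datax", "stream2stream.json") returns the process exit code, so a wrapper script can treat any non-zero value as a failed run.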

Which reader or writer do you want to use? The DataX page on GitHub lists the available reader and writer plugins.

Open the documentation for the plugin you need and use the JSON template it provides.

If you can't open GitHub, it doesn't matter. Templates also ship with the plugins in the folder you downloaded.

It's that simple.
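To see which templates ship with your download, you can search the extracted folder. The sketch below assumes the plugins live under a "plugin" directory inside the DataX home and that template file names contain "template" and end in ".json"; both are assumptions about the package layout, so adjust the pattern if your copy differs. find_job_templates is a hypothetical helper name.

```python
from pathlib import Path


def find_job_templates(datax_home):
    """Return sorted paths of JSON template files under <datax_home>/plugin.

    Assumes template file names contain 'template' and end in '.json'.
    """
    plugin_dir = Path(datax_home) / "plugin"
    return sorted(plugin_dir.rglob("*template*.json"))
```

Printing each returned path shows which reader or writer the template belongs to, since the plugin name is part of the directory path.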

Example

MySQL read and write example

{
  "job": {
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "123123",
            "column": ["*"],
            "splitPk": "ID",
            "where": "ID <= 1888",
            "connection": [
              {
                "jdbcUrl": ["jdbc:mysql://192.168.1.1:3306/xxx?useUnicode=true&characterEncoding=utf8"],
                "table": ["t_member"]
              }
            ]
          }
        },
        "writer": {
          "name": "mysqlwriter",
          "parameter": {
            "column": ["*"],
            "connection": [
              {
                "jdbcUrl": "jdbc:mysql://192.168.1.2:3306/xxx?useUnicode=true&characterEncoding=utf8",
                "table": ["t_xxx"]
              }
            ],
            "password": "123123",
            "preSql": ["statement to run before writing, e.g. one that deletes from the table"],
            "session": ["set session sql_mode='ANSI'"],
            "username": "root",
            "writeMode": "insert"
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": "5"
      }
    }
  }
}
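One detail in this example is easy to miss: the reader's jdbcUrl is a list, while the writer's jdbcUrl is a plain string. The checker below is a rough sketch that flags jobs that do not follow the shape of this example. It is not a DataX tool, check_mysql_job is a hypothetical name, and the rules are taken from the example above rather than from an official schema.

```python
def check_mysql_job(job):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for index, item in enumerate(job.get("job", {}).get("content", [])):
        for role in ("reader", "writer"):
            plugin = item.get(role, {})
            params = plugin.get("parameter", {})
            where = f"content[{index}].{role}"
            if not plugin.get("name"):
                problems.append(f"{where}: missing plugin name")
            # Keys that both sides carry in the example above.
            for key in ("username", "password", "column", "connection"):
                if key not in params:
                    problems.append(f"{where}: missing '{key}'")
            # Reader uses a list of URLs, writer a single URL string.
            expected = list if role == "reader" else str
            for conn in params.get("connection", []):
                if not isinstance(conn.get("jdbcUrl"), expected):
                    problems.append(
                        f"{where}: jdbcUrl should be a {expected.__name__}"
                    )
    return problems
```

Load the job file with json.load and pass the resulting dict to check_mysql_job before starting DataX; a non-empty result points at the entries to fix.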