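1. List all databases on the MySQL server:

$ sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username root --password 123456

2. Connect to MySQL and list the tables in the test database:

$ sqoop list-tables --connect jdbc:mysql://localhost:3306/test --username root --password 123456

3. Copy the structure of a relational table into Hive. Only the table structure is copied, not its contents:

$ sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --table sqoop_testTabinMySql --username root --password 123456 --hive-table testNewTabInHive

4. Import a table from the relational database into Hive:

$ sqoop import --connect jdbc:mysql://localhost:3306/zxtest --username root --password 123456 --table sqoop_test --hive-import --hive-table s_test -m 1

5. Export the data of a Hive table to MySQL. The MySQL table hive_test must be created before the export runs:

$ sqoop export --connect jdbc:mysql://localhost:3306/zxtest --username root --password root --table hive_test --export-dir /user/hive/warehouse/new_test_partition/dt=2012-03-05

6. Import the data of a database table into files on HDFS:

$ sqoop import --connect jdbc:mysql://localhost:3306/compression --username hadoop --password 123456 --table HADOOP_USER_INFO -m 1 --target-dir /user/test

7. Incrementally import table data from the database into HDFS:

$ sqoop import --connect jdbc:mysql://localhost:3306/compression --username hadoop --password 123456 --table HADOOP_USER_INFO -m 1 --target-dir /user/test --check-column id --incremental append --last-value 3

Import

Sqoop data import has the following features:

1. It supports text files (--as-textfile), Avro (--as-avrodatafile), and SequenceFiles (--as-sequencefile). RCFile is not yet supported. The default is text.
2. It supports appending data, specified with --append.
3. It supports column selection (--columns) and row selection (--where), used together with --table.
4. It supports free-form selection, for example reading the result of a multi-table join: 'SELECT a.*, b.* FROM a JOIN b on (a.id == b.id)'. This cannot be used together with --table.
5. It supports a configurable number of map tasks (-m).
6. It supports compression (--compress).
7. It supports importing relational data into Hive (--hive-import) and HBase (--hbase-table).
   Importing into Hive takes three steps: 1) import the data into HDFS; 2) create the Hive table; 3) use "LOAD DATA INPATH" to load the data into the table.
   Importing into HBase takes two steps: 1) import the data into HDFS; 2) call the HBase put operation to write the data into the table row by row.

import moves data from a relational database onto HDFS. The default directory is /user/${user.name}/${tablename}; the target directory on HDFS can be set with --target-dir. export is the reverse of import: it loads data on HDFS into a relational database.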
Because Sqoop transfers data with map tasks, and each map task runs independently with no notion of a transaction, some map tasks may fail while others succeed. Sqoop offers a compromise for this problem: you specify an intermediate staging table, and the data is moved from the staging table to the destination table only after the export succeeds. The staging table is specified with --staging-table, and its structure must be created in advance:

$ sqoop export --connect jdbc:mysql://192.168.81.176/sqoop --username root --password passwd --table sds --export-dir /user/guojian/sds --staging-table sds_tmp

Note that the staging table is not available when --direct, --update-key, or --call (stored procedure) is used.

create-hive-table

Imports a relational database table into a Hive table.

Arguments:

--hive-home          Hive installation directory; this argument overrides the default Hive directory
--hive-overwrite     Overwrite data that already exists in the Hive table
--create-hive-table  Defaults to false. If set, the job fails when the target table already exists
--hive-table         Name of the Hive table to create
--table              Name of the relational database table

$ sqoop create-hive-table --connect jdbc:mysql://192.168.81.176/sqoop --username root --password passwd --table sds --hive-table sds_bak

By default, sds_bak is created in the default database. This step depends on HCatalog, which must be installed first; otherwise the following error is reported:

Hive history file=/tmp/guojian/hive_job_log_cfbe2de9-a358-4130-945c-b97c0add649d_1628102887.txt
FAILED: ParseException line 1:44 mismatched input ')' expecting Identifier near '(' in column specification

metastore

Configures shared metadata for Sqoop jobs, so that multiple users can define and execute Sqoop jobs stored in the same metastore. By default the data is stored in ~/.sqoop.

Start: sqoop metastore
Shut down: sqoop metastore --shutdown

The storage location of the metastore file is configured with sqoop.metastore.server.location in conf/sqoop-site.xml and points to a local file.

The metastore can be accessed over TCP/IP. The port is configured with sqoop.metastore.server.port and defaults to 16000.

Clients connect by setting sqoop.metastore.client.autoconnect.url or by using --meta-connect, with a value of the form jdbc:hsqldb:hsql://<server-name>:<port>/sqoop, for example jdbc:hsqldb:hsql://metaserver.example.com:16000/sqoop.

Sqoop will read the entire content of the password file and use it as the password. This includes any trailing whitespace characters, such as the newline characters that most text editors add by default. You need to make sure that your password file contains only the characters that belong to your password. On the command line you can use the echo command with the -n switch to store the password without any trailing whitespace. For example, to store the password secret you would call echo -n "secret" > password.file.

Sqoop automatically supports several databases, including MySQL.
Connect strings beginning with jdbc:mysql:// are handled automatically in Sqoop. (A full list of databases with built-in support is provided in the "Supported Databases" section. For some, you may need to install the JDBC driver yourself.)

You can use Sqoop with any other JDBC-compliant database. First, download the appropriate JDBC driver for the type of database you want to import, and install the .jar file in the $SQOOP_HOME/lib directory on your client machine. (This will be /usr/lib/sqoop/lib if you installed from an RPM or Debian package.) Each driver .jar file also has a specific driver class which defines the entry point to the driver. For example, MySQL's Connector/J library has a driver class of com.mysql.jdbc.Driver. Refer to your database vendor-specific documentation to determine the main driver class. This class must be provided as an argument to Sqoop with --driver.

For example, to connect to a SQLServer database, first download the driver from microsoft.com and install it in your Sqoop lib path.

Sqoop can also import the result set of an arbitrary SQL query. Instead of using the --table, --columns and --where arguments, you can specify a SQL statement with the --query argument.

When importing a free-form query, you must specify a destination directory with --target-dir.

Note: If you are issuing the query wrapped in double quotes ("), you will have to use \$CONDITIONS instead of just $CONDITIONS to prevent your shell from treating it as a shell variable. For example, a double-quoted query may look like: "SELECT * FROM x WHERE a='foo' AND \$CONDITIONS"

The facility of using a free-form query in the current version of Sqoop is limited to simple queries where there are no ambiguous projections and no OR conditions in the WHERE clause. Use of complex queries, such as queries that have sub-queries or joins leading to ambiguous projections, can lead to unexpected results. In other words, the WHERE clause must not contain OR.

Sqoop imports data in parallel from most database sources.
You can specify the number of map tasks (parallel processes) to use to perform the import by using the -m or --num-mappers argument. Each of these arguments takes an integer value which corresponds to the degree of parallelism to employ. By default, four tasks are used. Some databases may see improved performance by increasing this value to 8 or 16.

When performing parallel imports, Sqoop needs a criterion by which it can split the workload. Sqoop uses a splitting column to split the workload. By default, Sqoop will identify the primary key column (if present) in a table and use it as the splitting column. The low and high values for the splitting column are retrieved from the database, and the map tasks operate on evenly-sized components of the total range. For example, if you had a table with a primary key column of id whose minimum value was 0 and maximum value was 1000, and Sqoop was directed to use 4 tasks, Sqoop would run four processes which each execute SQL statements of the form SELECT * FROM sometable WHERE id >= lo AND id < hi, with (lo, hi) set to (0, 250), (250, 500), (500, 750), and (750, 1001) in the different tasks.

Sqoop cannot currently split on multi-column indices. If your table has no index column, or has a multi-column key, then you must also manually choose a splitting column.

If a table does not have a primary key defined and --split-by <col> is not provided, then the import will fail unless the number of mappers is explicitly set to one with the --num-mappers 1 option or the --autoreset-to-one-mapper option is used. The option --autoreset-to-one-mapper is typically used with the import-all-tables tool to automatically handle tables without a primary key in a schema.

Sqoop will copy the jars in the $SQOOP_HOME/lib folder to the job cache every time a Sqoop job starts. When launched by Oozie this is unnecessary, since Oozie uses its own Sqoop share lib which keeps Sqoop dependencies in the distributed cache.
Oozie will do the localization on each worker node for the Sqoop dependencies only once, during the first Sqoop job, and reuse the jars on the worker node for subsequent jobs. Using the --skip-dist-cache option in the Sqoop command when launched by Oozie skips the step in which Sqoop copies its dependencies to the job cache, and saves massive I/O.

MySQL provides the mysqldump tool, which can export data from MySQL to other systems very quickly. By supplying the --direct argument, you are specifying that Sqoop should attempt the direct import channel. This channel may be higher performance than using JDBC.

By default, Sqoop will import a table named foo to a directory named foo inside your home directory in HDFS. For example, if your username is someuser, then the import tool will write to /user/someuser/foo/(files). You can adjust the parent directory of the import with the --warehouse-dir argument. For example:

$ sqoop import --connect <connect-str> --table foo --warehouse-dir /shared \
    ...

When using direct mode, you can specify additional arguments which should be passed to the underlying tool. If the argument -- is given on the command line, then subsequent arguments are sent directly to the underlying tool. For example, the following adjusts the character set used by mysqldump:

$ sqoop import --connect jdbc:mysql://server.foo.com/db --table bar \
    --direct -- --default-character-set=latin1

By default, imports go to a new target location. If the destination directory already exists in HDFS, Sqoop will refuse to import and overwrite that directory's contents. If you use the --append argument, Sqoop will import data to a temporary directory and then rename the files into the normal target directory in a manner that does not conflict with existing filenames in that directory.

Sqoop is preconfigured to map most SQL types to appropriate Java or Hive representatives.
However, the default mapping might not be suitable for everyone and can be overridden with --map-column-java (for changing the mapping to Java) or --map-column-hive (for changing the Hive mapping).

Sqoop expects a comma-separated list of mappings in the form <name of column>=<new type>. For example:

$ sqoop import ... --map-column-java id=String,value=Integer

Sqoop will raise an exception if a configured mapping is not used.
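
As a rough sketch of the comma-separated column=type format, the snippet below assembles a --map-column-java value from individual overrides. The column names id and value come from the example above; join_mappings is a hypothetical helper, and the final command is only printed, not executed.

```shell
# join_mappings is a hypothetical helper: it joins its arguments with commas.
join_mappings() {
    old_ifs=$IFS
    IFS=,
    joined="$*"      # "$*" joins the arguments with the first character of IFS
    IFS=$old_ifs
    printf '%s\n' "$joined"
}

map_java=$(join_mappings "id=String" "value=Integer")

# Print the command instead of running it; "..." stands for the connection
# and table arguments, as in the example above.
echo "sqoop import ... --map-column-java $map_java"
```

Keeping each override as a separate argument makes it easier to spot a mapping for a column that is no longer imported, which is the case that triggers the exception.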

You should specify append mode when importing a table where new rows are continually being added with increasing row id values. You specify the column containing the row’s id with --check-column. Sqoop imports rows where the check column has a value greater than the one specified with --last-value.

At the end of an incremental import, the value which should be specified as --last-value for a subsequent import is printed to the screen. When running a subsequent import, you should specify --last-value in this way to ensure you import only the new or updated data.
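
One way to carry --last-value from one run to the next is to keep it in a small state file. The following is a minimal sketch under stated assumptions: the state file, the check column id, and the value 300 are all made up for illustration, the EMPLOYEES table and connect string are borrowed from the examples in these notes, and the command is printed rather than executed.

```shell
# Hypothetical state file holding the value to pass as --last-value.
# A temporary directory is used so the sketch leaves nothing behind;
# a real script would use a persistent path.
state_dir=$(mktemp -d)
state_file="$state_dir/employees.last_value"

# First run: start from 0 when no value has been recorded yet.
[ -f "$state_file" ] || echo 0 > "$state_file"
last_value=$(cat "$state_file")

cmd="sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
--incremental append --check-column id --last-value $last_value"
echo "$cmd"

# After a successful import, record the value Sqoop printed so the next run
# picks it up (300 is a placeholder, not real output).
echo 300 > "$state_file"
```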

You can import data in one of two file formats: delimited text or SequenceFiles.

Delimited text is appropriate for most non-binary data types. It also readily supports further manipulation by other tools, such as Hive.

Reading from SequenceFiles is higher-performance than reading from text files, as records do not need to be parsed.

By default, data is not compressed. You can compress your data by using the deflate (gzip) algorithm with the -z or --compress argument, or specify any Hadoop compression codec using the --compression-codec argument. This applies to SequenceFile, text, and Avro files.
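
The two compression options can be sketched as follows. build_import is a hypothetical helper that only prints the command; the EMPLOYEES table and connect string are borrowed from the examples in these notes, and BZip2Codec is just one Hadoop codec class chosen for illustration.

```shell
# build_import is a hypothetical helper that prints (does not run) an import
# command, optionally naming a specific Hadoop codec class.
build_import() {
    codec=$1
    base="sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES"
    if [ -z "$codec" ]; then
        # --compress alone uses the default deflate (gzip) algorithm.
        echo "$base --compress"
    else
        echo "$base --compress --compression-codec $codec"
    fi
}

default_cmd=$(build_import "")
bzip2_cmd=$(build_import org.apache.hadoop.io.compress.BZip2Codec)
echo "$default_cmd"
echo "$bzip2_cmd"
```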

While the choice of delimiters is most important for a text-mode import, it is still relevant if you import to SequenceFiles with --as-sequencefile. The generated class' toString() method will use the delimiters you specify, so subsequent formatting of the output data will rely on the delimiters you choose.

When Sqoop imports data to HDFS, it generates a Java class which can reinterpret the text files that it creates when doing a delimited-format import. The delimiters are chosen with arguments such as --fields-terminated-by; this controls both how the data is written to disk, and how the generated parse() method reinterprets this data. The delimiters used by the parse() method can be chosen independently of the output arguments, by using --input-fields-terminated-by, and so on. This is useful, for example, to generate classes which can parse records created with one set of delimiters, and emit the records to a different set of files using a separate set of delimiters.
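
The idea of parsing with one delimiter set and emitting with another can be illustrated outside Sqoop. In this sketch awk stands in for the generated class: it is not what Sqoop runs, and the sample records are invented. The input is what a text-mode import might write with --fields-terminated-by '\t'.

```shell
# Two made-up records, tab-delimited as with --fields-terminated-by '\t'.
records=$(printf '1\tAlice\tEngineering\n2\tBob\tSales\n')

# Parse with the input delimiter (tab) and emit with a different output
# delimiter (comma). Assigning $1 to itself makes awk rebuild the record
# using OFS.
converted=$(printf '%s\n' "$records" |
    awk 'BEGIN { FS = "\t"; OFS = "," } { $1 = $1; print }')

printf '%s\n' "$converted"
```

In Sqoop itself the same separation is expressed by pairing --input-fields-terminated-by (how records are parsed) with --fields-terminated-by (how they are written).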

September 2, 2016: reading notes

Hive can put data into partitions for more efficient query performance. You can tell a Sqoop job to import data for Hive into a particular partition by specifying the --hive-partition-key and --hive-partition-value arguments. The partition value must be a string. Please see the Hive documentation for more details on partitioning.
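
A partitioned Hive import might look like the sketch below. The partition key dt and the value 2012-03-05 mirror the dt=2012-03-05 directory used earlier in these notes, while pairing them with the EMPLOYEES table is an assumption made for illustration. The command is printed rather than executed.

```shell
# Assumed partition column and value; the value is passed as a string.
partition_key="dt"
partition_value="2012-03-05"

cmd="sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
--hive-import --hive-partition-key $partition_key --hive-partition-value $partition_value"
echo "$cmd"
```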

7.3. Example Invocations

The following examples illustrate how to use the import tool in a variety of situations.

A basic import of a table named EMPLOYEES in the corp database:

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

A basic import requiring a login: