@pranaysahith
Last active June 3, 2017 08:58
1. Import orders with order ids 1 through 30,000 from table retail_db.orders into the HDFS folder /user/cloudera/problem1 using Sqoop.
Fields should be tab delimited, and the data should be stored in sequence file format with gzip compression. The result should have 30,000 records.
Sol:- sqoop import --connect jdbc:mysql://quickstart.cloudera/retail_db --username retail_dba --password cloudera --table orders --where 'order_id <= 30000' --target-dir /user/cloudera/problem1 --compress --as-sequencefile --fields-terminated-by '\t'
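A note on the tab delimiter: the shell removes the backslash from an unquoted \t, so the program receives the plain letter t rather than the escape sequence that Sqoop interprets as a tab. The small sketch below shows the difference using printf only; the variable names are just for illustration.

```shell
# Unquoted: the shell strips the backslash, so the argument becomes "t".
unquoted=$(printf '%s' \t)

# Single-quoted: the two characters "\" and "t" are passed through intact.
quoted=$(printf '%s' '\t')

# printf with %s prints the values without interpreting escapes.
printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

With the delimiter written as '\t', Sqoop receives the escape sequence itself.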
2. Export data from the HDFS folder /user/cloudera/problem2 to table retail_db.orders1. The data in HDFS is compressed using gzip, and fields
are delimited by ','.
Setup for problem 2:
mysql -u retail_dba -pcloudera;
use retail_db;
create table orders1 like orders;
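No solution for problem 2 is given here. One possible approach, offered as a sketch rather than the author's answer, is a sqoop export that reuses the connection string and credentials from problem 1. It assumes the files in /user/cloudera/problem2 are gzip-compressed text that Hadoop can decompress on read, which is worth verifying on the cluster. Holding the command in the positional parameters and guarding on whether sqoop is installed are illustrative choices so the command can be reviewed before it runs.

```shell
# Build the export command as positional parameters so it can be printed first.
set -- sqoop export \
  --connect jdbc:mysql://quickstart.cloudera/retail_db \
  --username retail_dba \
  --password cloudera \
  --table orders1 \
  --export-dir /user/cloudera/problem2 \
  --input-fields-terminated-by ','

# Show the command for review.
printf '%s ' "$@"
printf '\n'

# Run it only on a machine where sqoop is actually installed.
if command -v sqoop >/dev/null 2>&1; then
  "$@"
fi
```

After the export, a count such as select count(*) from orders1; in the mysql session can be compared with the number of records in the HDFS folder.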
3.