@longshilin
Last active January 20, 2018 05:02
spark-wordcount | program in Scala | total: 1 file
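package spark

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

// Despite its name, this job counts the lines containing "a" and "b"; it does not count words.
object wordcount {
  // Spark's docs recommend an explicit main() over `extends App`, which may not work correctly.
  def main(args: Array[String]): Unit = {
    val logFile = "hdfs://hadoop:9000/README.md"
    // "local" runs Spark in-process with a single worker thread.
    val conf = new SparkConf().setAppName("wordcount").setMaster("local")
    val sc = new SparkContext(conf)
    try {
      // Ask for at least 2 partitions; cache because the RDD is scanned twice below.
      val logData = sc.textFile(logFile, 2).cache()
      val numAs = logData.filter(line => line.contains("a")).count()
      val numBs = logData.filter(line => line.contains("b")).count()
      println(s"Lines with a: $numAs, Lines with b: $numBs")
    } finally {
      sc.stop() // release Spark resources even if the job fails
    }
  }
}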
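The gist is titled spark-wordcount, but the program above only counts matching lines. For comparison, here is a minimal sketch of what an actual word count over the same file might look like. The names `WordCountSketch` and `tokenize`, the app name, the `local[*]` master, and the choice to print the ten most frequent words are all assumptions made for illustration and are not part of the original gist.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical companion to the gist: counts word occurrences instead of matching lines.
object WordCountSketch {
  // Lowercase the line, split on runs of non-word characters, and drop empty tokens.
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    // Assumption: local mode using all available cores.
    val conf = new SparkConf().setAppName("wordcount-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val counts = sc.textFile("hdfs://hadoop:9000/README.md") // same path as the gist
        .flatMap(line => tokenize(line)) // one record per word
        .map(word => (word, 1))          // pair each word with a count of 1
        .reduceByKey(_ + _)              // sum the counts per word

      // Print the ten most frequent words (an arbitrary cutoff for this sketch).
      counts.sortBy(_._2, ascending = false)
        .take(10)
        .foreach { case (word, n) => println(s"$word: $n") }
    } finally {
      sc.stop()
    }
  }
}
```

Keeping `tokenize` as a plain method on the object means the splitting rule can be checked without starting a Spark context. What counts as a word here (anything matched by `\w+`, case-insensitive) is only one possible convention.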