How to work with a Spark DataFrame in Java; how to print a Spark DataFrame
How to work with a Spark DataFrame in Java
import java.util.Properties;
import org.apache.log4j.Logger;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.SaveMode;
public class Demo_Mysql3 {
    // Use this class as the logger name (the original referenced Demo_Mysql2)
    private static Logger logger = Logger.getLogger(Demo_Mysql3.class);
    public static void main(String[] args) {
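How to print a Spark DataFrame
Printing the schema of a DataFrame
After creating a DataFrame, we usually want to inspect the schema of its data. The printSchema function does this; it prints the name and type of each column: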
students.printSchema
root
 |-- id: string (nullable = true)
 |-- studentName: string (nullable = true)
 |-- phone: string (nullable = true)
 |-- email: string (nullable = true)
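If the DataFrame was created with the load approach instead, students.printSchema produces the following output, in which the entire header line appears as a single string column: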
root
 |-- id|studentName|phone|email: string (nullable = true)
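The single-column schema above is what one would expect if the header line was never split on the `|` character. The following plain-Java sketch illustrates that effect without using Spark at all. It assumes the input file is pipe-delimited, which the original does not state, and the class and method names (`HeaderSplitSketch`, `columnNames`, `renderSchema`) are hypothetical helpers that only imitate the printSchema layout; they are not part of the Spark API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class HeaderSplitSketch {

    // Splits a header line into column names using a literal delimiter.
    // Pattern.quote is needed because "|" is a regex metacharacter.
    public static List<String> columnNames(String headerLine, String delimiter) {
        return Arrays.asList(headerLine.split(Pattern.quote(delimiter), -1));
    }

    // Renders column names in a layout imitating printSchema output.
    // Assumption: every column is treated as a nullable string.
    public static String renderSchema(List<String> columns) {
        StringBuilder sb = new StringBuilder("root\n");
        for (String name : columns) {
            sb.append(" |-- ").append(name).append(": string (nullable = true)\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String header = "id|studentName|phone|email";
        // Non-matching delimiter: the line has no comma, so it stays one column.
        System.out.print(renderSchema(columnNames(header, ",")));
        // Matching delimiter: the header splits into four columns.
        System.out.print(renderSchema(columnNames(header, "|")));
    }
}
```

With a non-matching delimiter the sketch reports one column named `id|studentName|phone|email`; with `|` it reports the four columns `id`, `studentName`, `phone`, and `email`. If the same cause applies to the real load call, the remedy would be to tell the reader which delimiter the file uses.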
How to save a Spark DataFrame to MongoDB
import scala.collection.mutable.ArrayBuffer
import scala.io.Source
import java.io.PrintWriter
import util.control.Breaks._
import org.apache.spark.SparkContext
import org