How to get the structure of JSON in Scala

I have a lot of JSON files with no fixed structure, and for each deeply nested element I want to get the full path of keys leading to it.

For example :

{
"menu": {
    "id": "file",
    "popup": {
        "menuitem": {
                  "module": {
                      "-vdsr": "New",
                      "-sdst": "Open",
                      "-mpoi": "Close" }
        ...
    }
}

In this case the result would be :

menu.popup.menuitem.module.-vdsr
menu.popup.menuitem.module.-sdst
menu.popup.menuitem.module.-mpoi

I tried Jackson and Json4s. They are efficient at getting to the final value, but I don't see how to get the whole structure.

I want to run this as an Apache Spark job on very large JSON files, each with a very complex structure. I also tried Spark SQL, but it doesn't help if I don't know the entire structure in advance.


What you're asking for is essentially a tree traversal, where JSON objects are nodes with named branches and every other JSON type is a leaf. There are many ways to do this. One option is a recursive function that explores the entire tree and builds up the path as it goes. Here is an example that works in Play JSON, but it shouldn't be very different in other libraries:

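import play.api.libs.json._

// Returns the dot-separated key path to every leaf below `json`.
// A non-object value has no paths of its own, so its parent emits just the key
// (this avoids a trailing "." at the end of each path).
def unfold(json: JsValue): Seq[String] = json match {
    case obj: JsObject => obj.fields.toSeq.flatMap { case (key, value) =>
        unfold(value) match {
            case Seq() => Seq(key)                              // leaf or empty object
            case paths => paths.map(path => s"$key.$path")      // prefix child paths
        }
    }
    case _ => Seq.empty
}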
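
To see what such a traversal produces without depending on any JSON library, here is a minimal sketch over a hand-rolled tree type. The names `PathDemo`, `Node`, `Obj`, `Leaf` and `paths` are hypothetical, invented for this illustration only; they are not part of Play JSON, Jackson or Json4s. The sample data mirrors the example from the question, with the elided part left out.

```scala
object PathDemo {
  // Hypothetical miniature JSON tree, standing in for a real library's types.
  sealed trait Node
  final case class Obj(fields: Seq[(String, Node)]) extends Node
  final case class Leaf(value: String) extends Node

  // Collects the dot-separated key path to every leaf (or empty object).
  def paths(node: Node): Seq[String] = node match {
    case Obj(fields) => fields.flatMap { case (key, child) =>
      val below = paths(child)
      if (below.isEmpty) Seq(key) else below.map(p => s"$key.$p")
    }
    case Leaf(_) => Seq.empty
  }

  // The question's example, minus the part elided with "...".
  val sample: Node = Obj(Seq(
    "menu" -> Obj(Seq(
      "id" -> Leaf("file"),
      "popup" -> Obj(Seq(
        "menuitem" -> Obj(Seq(
          "module" -> Obj(Seq(
            "-vdsr" -> Leaf("New"),
            "-sdst" -> Leaf("Open"),
            "-mpoi" -> Leaf("Close")
          ))
        ))
      ))
    ))
  ))

  def main(args: Array[String]): Unit =
    paths(sample).foreach(println)
}
```

Note that this lists every leaf, so `menu.id` shows up alongside the three `menu.popup.menuitem.module.*` paths. If only the deepest paths are wanted, one option is to filter the result by the number of `.`-separated segments.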