I'm parsing some HTML and need to get the entire inner HTML of <body>. I'm doing it this way:
TFHpple *doc = [[TFHpple alloc] initWithHTMLData:[NSData dataWithContentsOfFile:sectionFilePath]];
TFHppleElement *body = [doc searchWithXPathQuery:@"//body"][0];
NSString *bodyHTML = body.raw;
However this returns:
<body>stuff inside body</body>
instead of just:
stuff inside body
Question: Is there any way to get purely the inner HTML of an element, excluding its own tags?
I came up with this method, but I feel like I'm reinventing the wheel here. This method is also quite slow.
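// Accumulate into a mutable string; a plain NSString has no appendString:.
NSMutableString *innerHTML = [NSMutableString string];
for (TFHppleElement *child in body.children) {
    // Use the node's raw markup when present, otherwise fall back to its content.
    if (child.raw != nil) [innerHTML appendString:child.raw];
    else if (child.content != nil) [innerHTML appendString:child.content];
}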
Try this...
// Load the page and parse it with TFHpple.
NSURL *url = [NSURL URLWithString: URL_HERE];
NSData *htmlData = [NSData dataWithContentsOfURL:url];
TFHpple *parser = [TFHpple hppleWithHTMLData:htmlData];
NSString *xpathQueryString = @"//body";
NSArray *nodes = [parser searchWithXPathQuery:xpathQueryString];
for (TFHppleElement *element in nodes) {
    // Note: this reads only the content of the first child of each <body> match.
    lable.text = [[element firstChild] content];
}
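If you need everything inside the element rather than just the first child, another option is to trim the outer tags off the string that `raw` returns. This is only a minimal sketch: `InnerHTMLFromRaw` is a hypothetical helper (not part of TFHpple), and it assumes `raw` has the form `<tag ...>inner</tag>` and that no attribute value in the opening tag contains a `>` character.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of TFHpple: strips the outermost opening and
// closing tags from a serialized element such as @"<body>stuff</body>".
static NSString *InnerHTMLFromRaw(NSString *raw) {
    if (raw == nil) return nil;
    // The opening tag is assumed to end at the first '>' in the string.
    NSRange open = [raw rangeOfString:@">"];
    // The closing tag is assumed to start at the last "</" in the string.
    NSRange close = [raw rangeOfString:@"</" options:NSBackwardsSearch];
    if (open.location == NSNotFound || close.location == NSNotFound ||
        close.location < NSMaxRange(open)) {
        return @""; // self-closing or unexpected markup: nothing inside
    }
    NSUInteger start = NSMaxRange(open);
    return [raw substringWithRange:NSMakeRange(start, close.location - start)];
}
```

With the `body` element from the question, usage would look like `NSString *inner = InnerHTMLFromRaw(body.raw);`. It avoids walking the children, but it is plain string trimming, so it is worth checking against your real markup first.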