Microdata vs JSON-LD: Which Structured Data Format Wins?

Siegfried, deploy!

Helping you develop fast websites that scale. We're Steffen & Dominik, developers, friends and agency owners since 2011. Join us for hands-on tools and tactics to build and maintain large scale WordPress websites.

August 31, 2023 • Bleech

Are you wrestling to decide which data format is superior, Microdata or JSON-LD? Take a seat and let us take you on a journey exploring the pros and cons of these two formats in relation to websites.

Highlights
00:00 Introduction to Structured Data Formats
00:44 Importance of Structured Data and Schema.org
01:45 Implementation of JSON-LD
02:56 Implementation of Microdata
04:01 Multiple JSON-LD Script Tags
05:12 When to Use Which Format
08:30 Ease of Implementation and Debugging

Links
- Structured Data Schemas: https://schema.org/
- Structured Data Validator: https://validator.schema.org/
- Rich Results Tests: https://search.google.com/test/rich-results

Steffen: 0:00

Hey Dominik, oh Steffen, what's the better structured data format, Microdata or JSON-LD? What do you think?

Steffen: 0:13

I don't know. And what about this other thing, RFD?

Dominik: 0:18

Ah, RDFa.

Steffen: 0:21

Why is that one out of the question already?

Dominik: 0:24

Well, I guess because it's really similar to Microdata and has basically the same advantages and disadvantages, unless you really go into depth, where there might be reasons to use one over the other. But you know the answer, Steffen: it depends, right?

Dominik: 0:44

So it always depends on what you want to do. Microdata and JSON-LD are both ways of integrating structured data, specifically Schema.org data, which is what we usually talk about, into an existing website, so that search engines and other programmatic tools can make better use of the information you provide on your website.

Dominik: 1:07

On the one hand, it's important for normal Google search, but there are other search engines and content aggregators, and also things like Google Maps. If you specify properties, availabilities, and prices inside your markup, this will actually be picked up by Google Maps or other aggregators and displayed on their sites without any additional work from your side. And this is really awesome, because you don't have to take care of managing content on multiple platforms; you can just use the integration. But which one is better than the other? I would say it always depends. Usually I would prefer JSON-LD. The reason is that you have one blob of JSON at some point in your markup and you can easily debug that, because you can just read it out and see what kind of content you've provided there.
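To illustrate the "read it out" debugging step Dominik mentions, here is a minimal sketch that pulls every JSON-LD blob out of a page and parses it. It uses only Python's standard library; the `JsonLdExtractor` class, the `extract_jsonld` helper, and the sample product markup are made up for this example and are not from the episode.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.items.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return a list with one parsed object per JSON-LD script tag in the HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical page fragment, invented for illustration.
SAMPLE_HTML = """
<article>
  <h1>Standing desk</h1>
  <script type="application/ld+json">
  {"@context": "https://schema.org", "@type": "Product", "name": "Standing desk",
   "offers": {"@type": "Offer", "price": "499.00", "priceCurrency": "EUR",
              "availability": "https://schema.org/InStock"}}
  </script>
</article>
"""

if __name__ == "__main__":
    for item in extract_jsonld(SAMPLE_HTML):
        print(json.dumps(item, indent=2))
```

Because the structured data lives in one place, a quick local check like this can show what a crawler would see before the page goes to one of the validators listed under Links.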

Dominik: 2:01

But this is usually also harder to implement. Where it is easy to implement is on pages for things like events, or for entities that are predefined and have a fixed structure. When you render a page for this content, you usually have all of the information already gathered in one place. You can say, okay, this is my event, this is the event start date or whatever, and I will just render this out into a JSON object. Then only the data that I actually want to display gets rendered in the markup as real HTML elements and attributes, and thus you can keep the markup really clean and separate the actual structured data from the markup. So this is why I would usually go that way. But in some situations it is a lot better, or at least easier, to define your structured data as HTML attributes. You can use Microdata or RDFa for that; the difference is just the naming of the attributes and the URLs where the schemas are located. Typical examples are accordions, because at the beginning you don't really know what kind of content or how many items there will be. There might also be a lot of content, like a lot of text, and then you would duplicate a lot of characters, once in your markup and again in the JSON-LD. So instead of extracting that and putting it into a JSON object, I would just add these attributes to the HTML elements directly. Other examples are sliders or galleries, where you have multiple elements and you don't know how many there are.
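The accordion case could look roughly like the sketch below, where the Microdata attributes sit directly on the template's elements and the text is written only once. Treating the accordion as a Schema.org `FAQPage` is an assumption made for this example, and `render_accordion` is a hypothetical helper, not code from the episode.

```python
from html import escape


def render_accordion(items):
    """Render (headline, content) pairs as an accordion with Microdata attributes.

    Assumption: the accordion holds questions and answers, so it is marked up
    as a Schema.org FAQPage. Other content would need a different type.
    """
    parts = ['<div class="accordion" itemscope itemtype="https://schema.org/FAQPage">']
    for headline, content in items:
        parts.append(
            '  <section itemscope itemprop="mainEntity" itemtype="https://schema.org/Question">\n'
            f'    <h3 itemprop="name">{escape(headline)}</h3>\n'
            '    <div itemscope itemprop="acceptedAnswer" itemtype="https://schema.org/Answer">\n'
            f'      <div itemprop="text">{escape(content)}</div>\n'
            "    </div>\n"
            "  </section>"
        )
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_accordion([
        ("Do you ship abroad?", "Yes, within the EU."),
        ("Can I return items?", "Within 30 days."),
    ]))
```

The template does not need to know how many items there are, and no text is duplicated into a separate JSON object.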

Dominik: 4:01

It's also important to know that you can have multiple JSON-LD script tags on your website; it doesn't have to be only one. So, for example, if you have an overview with events, I would probably put one script tag with the JSON-LD data inside each of the containers where you render an event. And then, if the event has multiple images, for example, you could think about whether you want to add those to the JSON-LD or combine it with Microdata. However, now that I say that, I am not 100% sure whether you can combine them, whether this data gets merged. I think it's one or the other. But you can have multiple things: you can, for example, have an event as JSON-LD and an FAQ as Microdata on the same site, and this will be picked up. But I don't know if you can mix JSON-LD and Microdata for one event. I think I haven't tried.
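A minimal sketch of the events overview described here, with one JSON-LD script tag inside each event container. The field names (`name`, `start_date`, `venue`) and the helper functions are assumptions for illustration; a real event would likely carry more properties.

```python
import json
from html import escape


def event_jsonld(event):
    """Map a hypothetical event record to a Schema.org Event object."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": event["name"],
        "startDate": event["start_date"],
        "location": {"@type": "Place", "name": event["venue"]},
    }


def render_event_card(event):
    """Render one event container with its own JSON-LD script tag."""
    # Escape "<" so that a "</script>" inside a value cannot close the tag early.
    payload = json.dumps(event_jsonld(event), indent=2).replace("<", "\\u003c")
    return (
        '<article class="event">\n'
        f'  <h2>{escape(event["name"])}</h2>\n'
        f'  <script type="application/ld+json">\n{payload}\n  </script>\n'
        "</article>"
    )


def render_overview(events):
    """Render all event containers, each carrying its own structured data."""
    return "\n".join(render_event_card(e) for e in events)


if __name__ == "__main__":
    print(render_overview([
        {"name": "WordCamp Meetup", "start_date": "2023-10-05T18:00", "venue": "Town Hall"},
        {"name": "Performance Workshop", "start_date": "2023-11-12T09:30", "venue": "Office"},
    ]))
```

Only the headline is rendered as visible markup; everything else stays in the JSON object next to it.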

Steffen: 5:12

OK, so you're basically saying that you would recommend Microdata when the markup is predictable, right? With an accordion, there's likely not much that will change in terms of the markup and the layout, so you don't have to pay a lot of attention. It's always the title and the content, and the relationship is very clear. Whereas when the representation is more complex, you might accidentally mess with the markup, and it gets much more complicated to make that work and keep track of it within the markup. Then it's better to use JSON, because there you have a very structured format and you can prepare it, but at the same time you're kind of duplicating the content. So it really depends there again. But what you said about the sliders: why don't you like to use JSON objects when there are multiple elements? Couldn't you just run a loop and generate that JSON?

Dominik: 6:25

Yes, yeah, you totally could. It's just a preference, I guess. If it's something that is part of a more structured data entity, I would probably put it in an array and just read it out. But if it's a manual thing, right, if you have a manual slider and you can just fill it with images... Yeah, I think your explanation before was actually quite good.
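The loop Steffen asks about could be sketched like this: collect the slides into an array and emit them as one JSON-LD object. The choice of `ImageGallery` and `ImageObject`, the mapping of alt text to `description`, and the slide field names are assumptions for this example.

```python
import json


def gallery_jsonld(slides):
    """Build one JSON-LD object from a list of slides by looping over them."""
    images = []
    for slide in slides:
        images.append({
            "@type": "ImageObject",
            "contentUrl": slide["url"],
            # Assumption: the alt text doubles as the image description.
            "description": slide["alt"],
        })
    return {
        "@context": "https://schema.org",
        "@type": "ImageGallery",
        "image": images,
    }


if __name__ == "__main__":
    slides = [
        {"url": "https://example.com/img/office.jpg", "alt": "Our office"},
        {"url": "https://example.com/img/team.jpg", "alt": "The team"},
    ]
    print(json.dumps(gallery_jsonld(slides), indent=2))
```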

Dominik: 6:51

When you have really simple structures, like in an accordion, and you have a template, it's easy: an accordion always consists of two things, the headline and the content, so you can easily just add these attributes to the markup, to the template. For a slider, it's usually the same.

Dominik: 7:12

You have an alt text and you have the URL of the image.

Dominik: 7:16

I guess this is also quite easy to do. I'm always trying to balance the ease of implementation against the correctness, or the quality, in the end.

Dominik: 7:28

So in general, if it was really easy to implement all of that, I would probably say just always go with JSON, because this is usually the cleaner way to do things and, again, easier to debug. But sometimes it's just a lot harder, for example if you are in a page builder. In Flynt we always use kind of our own page builder with ACF Flexible Content, and in order to get all the data you would have to loop through the entire content and figure out what the structured data is. There it's just a lot easier to go into a component and, in this component, just say what the structured data fields are and add this to the markup. You could, of course, use some filter or an additional component, module, or feature to collect this data from every component that has been rendered, and so on. But again, this just makes things more complicated. I find it's easier if you see immediately in these components what is used as what.
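For comparison, the "collect this data from every component" alternative might look something like the following sketch. Everything here is hypothetical: `StructuredDataCollector`, `render_faq_component`, and `render_page` are invented names, and this is not how Flynt or ACF work internally. It only shows the extra moving part Dominik is referring to.

```python
import json
from html import escape


class StructuredDataCollector:
    """Gathers structured data from components while a page renders."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def render(self):
        """Emit everything collected as a single JSON-LD script tag."""
        if not self._items:
            return ""
        graph = {"@context": "https://schema.org", "@graph": self._items}
        # Escape "<" so a "</script>" inside a value cannot close the tag early.
        payload = json.dumps(graph, indent=2).replace("<", "\\u003c")
        return f'<script type="application/ld+json">\n{payload}\n</script>'


def render_faq_component(fields, collector):
    """A component that renders clean markup and reports its data separately."""
    collector.add({
        "@type": "FAQPage",
        "mainEntity": [
            {"@type": "Question", "name": question,
             "acceptedAnswer": {"@type": "Answer", "text": answer}}
            for question, answer in fields["items"]
        ],
    })
    rows = "".join(
        f"<dt>{escape(question)}</dt><dd>{escape(answer)}</dd>"
        for question, answer in fields["items"]
    )
    return f"<dl>{rows}</dl>"


def render_page(components):
    """Render each (render_function, fields) pair, then append the collected data."""
    collector = StructuredDataCollector()
    body = "\n".join(render(fields, collector) for render, fields in components)
    return body + "\n" + collector.render()


if __name__ == "__main__":
    print(render_page([
        (render_faq_component, {"items": [("Do you ship abroad?", "Yes.")]}),
    ]))
```

The markup stays clean, but every component now has to describe its data twice, once as HTML and once for the collector, which is the added complexity weighed against inline Microdata.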

Steffen: 8:29

Ah, okay. See, until now I thought that maybe it's more about the predictability of the data, but actually I think it's rather about the nesting of the data, right? Because the example that you gave was an event. You have all the event data available on a global level. You have it right away when you query the object, the field group, whatever it is, and you can just directly put it out into a JSON object. Whereas with a slider, which might have additional nested fields like text on top, icons, and options, you would need to do the rendering process twice, once for the markup and once for the JSON, and then you create a lot more code. It actually increases the complexity there, and then it might be preferable to just shift that complexity to the markup and use the typical Schema.org or Google validators to see if you got things right with the markup. Yeah, okay.

Steffen: 9:33

So it's great that we found that out for ourselves. I hope others will learn from this. Let us know if you agree with that, or if you have a different opinion or use different methods.

Dominik: 9:45

Yeah, it would be really interesting to hear about that. Bye, bye.

Steffen: 9:49

Bye, bye.