Width and height are missing from Mastodon posts in JSON

When parsing the JSON feeds, I’ve noticed that width and height attributes are missing from the img elements of Mastodon posts, but they seem to be present on posts that were natively posted to Micro.blog. My guess is that the native images are handled end-to-end by Micro.blog, while the Mastodon images are fetched/cached only once a client actually requests them.

But, without knowing the width/height of media elements, it’s difficult to display them without jankiness on the client side.

So, my question is: Would it be at all possible to parse the image dimensions from Mastodon posts on the server side so that they can be properly forwarded to the client?
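To illustrate what "parsing the image dimensions" could involve, here is a minimal sketch that reads width and height from the first bytes of an image file without decoding it. It only handles PNG, whose header layout is fixed; the images in this thread are JPEGs, which would need a scan for the start-of-frame marker instead. The function name `png_dimensions` is made up for this example, and nothing here reflects how Micro.blog actually processes images.

```python
import struct

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(header: bytes):
    """Return (width, height) from the first 24 bytes of a PNG, or None.

    Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type
    ("IHDR"), then width and height as big-endian unsigned 32-bit ints.
    """
    if len(header) < 24:
        return None
    if header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", header[16:24])
    return width, height
```

Since only the first 24 bytes are needed, a server could in principle request just the start of the file rather than downloading the whole image.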

Mastodon post:

    {
      "id": "27499451",
      "content_html": "<p>I’ve only been playing with it for a few minutes, but Mammoth 2.0 looks *really* good.</p><p><img src=\"https://cdn.micro.blog/photos/1000x/https%3A%2F%2Fcdn.masto.host%2Fnileanefr%2Fmedia_attachments%2Ffiles%2F111%2F540%2F133%2F730%2F058%2F778%2Foriginal%2Fee43cc9fbb195b47.jpeg\" alt=\"Various screen of Mammoth 2.0 for Mastodon\" loading=\"lazy\"></p>",
      "url": "https://nileane.fr/@nileane/111540138666274506",
      "date_published": "2023-12-07T16:44:07+00:00",
      "author": {
        "name": "Niléane",
        "url": "https://nileane.fr/users/nileane",
        "avatar": "https://micro.blog/photos/200/https%3A%2F%2Fcdn.masto.host%2Fisfeelingsocial%2Fcache%2Faccounts%2Favatars%2F109%2F831%2F348%2F663%2F242%2F341%2Foriginal%2F2c8691b13b658097.jpeg",
        "_microblog": {
          "username": "nileane@nileane.fr"
        }
      },
      "_microblog": {
        "date_relative": "5:44 pm",
        "date_timestamp": 1701967447,
        "is_favorite": false,
        "is_bookmark": false,
        "is_deletable": false,
        "is_conversation": true,
        "is_linkpost": false,
        "is_mention": false,
        "note": ""
      }
    },

Micro.blog post:

Observe the attributes width="450" height="600" on the img element.

    {
      "id": "27497091",
      "content_html": "<p>Rush is staying at his tante’s house while we get Synneva ready and used to being at home. This is how it is going 🙃</p>\n<img src=\"https://cdn.micro.blog/photos/1000x/https%3A%2F%2Fcdn.uploads.micro.blog%2F125928%2F2023%2Fimg-7954.jpg.jpeg\" width=\"450\" height=\"600\" alt=\"\" loading=\"lazy\">\n",
      "url": "https://eschapp.micro.blog/2023/12/07/rush-is-staying.html",
      "date_published": "2023-12-07T15:57:59+00:00",
      "author": {
        "name": "Eric Schapp",
        "url": "https://eschapp.micro.blog/",
        "avatar": "https://www.gravatar.com/avatar/1d663bb90817f8bbb0b90dd7057e93e1?s=96&d=https%3A%2F%2Fmicro.blog%2Fimages%2Fblank_avatar.png",
        "_microblog": {
          "username": "eschapp"
        }
      },
      "_microblog": {
        "date_relative": "4:57 pm",
        "date_timestamp": 1701964679,
        "is_favorite": false,
        "is_bookmark": false,
        "is_deletable": false,
        "is_conversation": false,
        "is_linkpost": false,
        "is_mention": false,
        "note": ""
      }
    },
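For anyone who wants to check their own feed for the same difference, here is a rough sketch that parses a `content_html` string and lists the images lacking width or height. It uses only Python's standard `html.parser`; the names `ImgCollector` and `images_missing_dimensions` are made up for this example.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect the attributes of every <img> element that is fed in."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images.append(dict(attrs))


def images_missing_dimensions(content_html):
    """Return the src of each image that lacks a width or a height."""
    collector = ImgCollector()
    collector.feed(content_html)
    collector.close()
    return [
        img.get("src")
        for img in collector.images
        if not (img.get("width") and img.get("height"))
    ]
```

Run against the two items above, this would flag the image in the Mastodon post and nothing in the Micro.blog post.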

It’s not just a Mastodon thing. The Micro.blog app adds width and height attributes to the HTML, but plenty of folks (including me) don’t post their photos that way. As such, my photos (by my own choice) don’t have width and height attributes in their HTML. It just depends on what the tool that created the content does.

I’m not sure about adjusting the HTML, but maybe we should get the width/height and include them in a separate JSON field for folks who want to conveniently access it?
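As a sketch of what such a separate field might look like, the function below copies a feed item and adds an `images` list under `_microblog`, leaving `content_html` untouched. The field name `images`, its shape, and the helper `with_image_metadata` are purely hypothetical; no such field exists in the feed today.

```python
def with_image_metadata(item, dimensions_by_url):
    """Return a copy of a feed item with a hypothetical `_microblog.images` list.

    `dimensions_by_url` maps an image URL to a (width, height) tuple.
    The original item and its HTML are left unchanged.
    """
    enriched = dict(item)
    microblog = dict(item.get("_microblog", {}))
    # Hypothetical field: one entry per image found in the post.
    microblog["images"] = [
        {"url": url, "width": width, "height": height}
        for url, (width, height) in dimensions_by_url.items()
    ]
    enriched["_microblog"] = microblog
    return enriched
```

Clients that don’t care about dimensions could simply ignore the extra field, which is the main appeal over rewriting the HTML.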

IMO, if you wanted to add information from Micro.blog posts about image content, that makes some sense, but it feels very much like “not in Micro.blog’s scope” to parse out image content of HTML from outside sources to identify images and get attributes from them. But maybe that’s just me.

If the images were separate data and not just inlined in the content HTML, that would feel different to me.

Yeah, that is fair. I’ll also double-check that Mastodon doesn’t already have the dimensions parsed out.
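For reference, Mastodon's REST API does expose dimensions on media attachments, under `meta.original`. Here is a small sketch of reading them from a status object as returned by that API. Whether Micro.blog receives this same data through whatever path it uses to ingest Mastodon posts is an assumption I haven't verified, and `attachment_dimensions` is just a name made up for this example.

```python
def attachment_dimensions(status):
    """Return (url, width, height) for each attachment that reports a size.

    Expects a status dict shaped like the Mastodon REST API response.
    `meta` can be missing or incomplete, so every lookup is defensive.
    """
    results = []
    for attachment in status.get("media_attachments", []):
        original = (attachment.get("meta") or {}).get("original") or {}
        width = original.get("width")
        height = original.get("height")
        if width and height:
            results.append((attachment.get("url"), width, height))
    return results
```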