I’m “repurposing” my blogs and wanted to move some entries from one micro.blog blog to another one. To do this I exported a WordPress XML file (from je.mostrom.eu) and imported it to the destination site (blog.mostrom.eu).
It seems like all entries were imported but not the photos, at least not all of them. On the original site there are about 100 photos that should have been imported, but I only see 9 at the destination. I waited for a while to see if the copying of photos is something done slowly in the background, but the number of imported photos hasn’t changed in an hour.
Sometimes the logs can give a clue about what went wrong. If not, do you have an example blog post with missing photos that we can take a closer look at?
I couldn’t find anything interesting in the logs, it basically says “import started”, “found image link” (but only a few of those), and “import done”. No errors at all.
Here is the first post where it goes wrong, Kime no kata | Bloggen, and the original at Kime no kata | Jan Erik Moström. If you open the one at blog.mostrom you will see that the image link is pointing to the original site.
The posts before this one seem OK, but the images from all that comes later are missing.
Doesn’t seem to work. What happens is that the same photos that were downloaded the last time are now downloaded a second time (so the 9 photos are now duplicated). I removed the “Kime no kata” post and it got replaced, but the photo is linked the same way.
And it seems like the logic is:
    for each new imported post:
        if there are attachments:
            download attachment
        if the imported post already exists:
            skip it
        else:
            add new imported post
but I have no idea if it really looks like this. In the logs I can see that there are no mentions of “found image” after the Kime no kata post.
Hmmm, this was interesting. Just for fun I decided to try an import to a WordPress site, checked the option to download media and change URLs to the new URL … and what happened was that all media was downloaded but all image links are pointing to the old site.
WordPress and micro.blog have nothing to do with each other, but both fail - in different ways - to do the import. Coincidence?
Interesting! Until @manton eventually gets to this, could we look at the XML for the Kime no kata post for clues? Maybe if we compare it with one of the posts that imports successfully, we can figure out what’s going on.
Good idea! And I’m pretty sure I’ve found the problem, and in my opinion it’s a bad one. The import seems to assume that image links are in “html format”, and it doesn’t do anything if the image links are in “markdown format”.
In other words, the import tool downloads images and converts URLs for images that use the img tag, but just ignores images that use Markdown. But these Markdown links are rendered normally on the new site. So if someone decides to move their blog from one micro.blog to another, they might do as I did: download a WordPress XML file, import it to their new micro.blog, check that everything looks OK … which it does, since the images are fetched from the old site … and then delete the old site. Boom, all images gone. I assume the same problem might happen for WordPress if Markdown is enabled. Based on these guesses I would say it’s potentially a bad bug.
I also noticed that my “source blog”, je.mostrom.eu, contained some Markdown links that use the “micro.blog subdomain” of the blog. So the entries there use two different domains for image links.
If I’m correct, this could be a problem for anyone who writes in Markdown and wants to move entries to another domain, or just simply change the domain name of the current blog.
Only the first image (the img element) gets uploaded to my blog. The second image (![]()) keeps the old URL (no upload).
Until @manton hopefully fixes this, and if you’re feeling adventurous, you could probably come up with a search-and-replace scheme as a workaround, converting the Markdown syntax in the XML file into HTML.
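A minimal sketch of such a search-and-replace in Python, for anyone who wants to try it. The function name `markdown_images_to_html` and the regular expression are illustrative only, not anything from Micro.blog, and the pattern only handles simple `![alt](url)` images (no nested brackets, no spaces or parentheses in the URL). It also assumes the post bodies in the export are unescaped text, for example inside CDATA sections; if they are XML-escaped, the inserted tags would need escaping too.

```python
import html
import re

# Matches simple Markdown images such as ![alt](https://example.org/a.jpg "title").
# Simplified on purpose: no nested brackets, no spaces or ")" inside the URL.
MD_IMAGE = re.compile(r'!\[([^\]]*)\]\(\s*([^\s)]+)(?:\s+"[^"]*")?\s*\)')


def markdown_images_to_html(text: str) -> str:
    """Replace Markdown image syntax with equivalent <img> elements."""

    def to_img(match: re.Match) -> str:
        # Escape both values so they are safe inside HTML attributes.
        alt = html.escape(match.group(1), quote=True)
        src = html.escape(match.group(2), quote=True)
        return f'<img src="{src}" alt="{alt}">'

    return MD_IMAGE.sub(to_img, text)
```

You would read the exported XML file as text, run it through this function, write the result to a new file, and import that file instead. Images that already use the img element are left untouched. Keep the original export around in case the pattern matches something it shouldn't.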
Thanks y’all for testing this! I’ve checked the code and confirmed the behavior. The reason it gets into trouble is because if the post already exists, it tries to update it, assuming that maybe something in the imported file changed. But that does cause it to run through the images again, creating duplicates.
Also, it assumes that WordPress files only have HTML, not Markdown, which is why images referenced in Markdown are not copied and why Sven’s trick should work.
I think the fix for this will be to keep track of the original WordPress URL for an image, so we can avoid copying it again.
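The idea of tracking the original URL could be sketched roughly like this. It is only an illustration, not Micro.blog's actual code: `ImageCopier`, `copy_once`, and the `upload` callback are hypothetical names, and a real importer would have to store the mapping persistently so that it survives between import runs.

```python
from typing import Callable, Dict


class ImageCopier:
    """Copies each original image URL at most once, however often it is seen."""

    def __init__(self, upload: Callable[[str], str]) -> None:
        # Hypothetical callback: copies the image at a URL and returns its new URL.
        self._upload = upload
        # Maps original URL -> URL on the destination site.
        self._copied: Dict[str, str] = {}

    def copy_once(self, original_url: str) -> str:
        """Return the new URL, uploading only the first time a URL is seen."""
        if original_url not in self._copied:
            self._copied[original_url] = self._upload(original_url)
        return self._copied[original_url]
```

With something like this, re-importing the same file (or updating a post that already exists) would look up the earlier copy instead of uploading a duplicate.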
I also wish we had a “clear out unused uploaded files” button. That would be useful in a few different situations.
That’s good sleuthing. My (limited) understanding of Markdown in WordPress is that it’s reliant on any number of plugins (some much more popular than others) that may store and treat the underlying database content and feed content quite differently. It’s pretty unusual to publish Markdown instead of HTML inside of an RSS feed, as I understand it.
Deep sigh, I don’t know what to say … it didn’t completely work.
@manton I think I might have discovered another issue that shows up when I do something like this. I’ll try to describe:
1. I create a micro.blog site and it gets the domain name x.micro.blog.
2. I start populating the new site by adding posts and photos; the photos are now referenced using the URL x.micro.blog/upload/some_image.jpg.
3. I then change the domain to a custom domain, for example myworld.org.
4. Here is the problem: I don’t think the URLs for the images get changed from x.micro.blog to myworld.org.
5. I add more posts; images for these posts get a myworld.org URL.
6. If I now make an export, the XML file contains image links that start with x.micro.blog as well as ones that start with myworld.org.
7. When imported to a new site, only the myworld.org images get correctly linked in the new site. Images with an x.micro.blog URL keep that URL. Could be a disaster if the x.micro.blog/myworld.org site is deleted.
The problem is caused by what happens in steps 3-4. I suspect the same problem will occur if I change from one custom domain to another.
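One way to check whether an export mixes domains like this is to count the hosts of all image URLs in the file. A rough sketch in Python: `image_hosts` is a made-up helper, the two regular expressions are simplified, and it assumes the post bodies in the export are unescaped text (for example inside CDATA sections), so that img tags appear literally.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Image URLs from HTML img tags and from simple Markdown image syntax.
HTML_IMG = re.compile(r'<img\b[^>]*?\bsrc=["\']([^"\']+)["\']', re.IGNORECASE)
MD_IMG = re.compile(r'!\[[^\]]*\]\(\s*([^\s)]+)')


def image_hosts(export_text: str) -> Counter:
    """Count how many image URLs in the text point at each host."""
    urls = HTML_IMG.findall(export_text) + MD_IMG.findall(export_text)
    return Counter(urlparse(url).netloc for url in urls)
```

Running this on the text of the export and printing the result should show at a glance whether both x.micro.blog and myworld.org (or any other old domain) still appear in image links.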
Sigh, things don’t work for me at all. The images are being imported to the new site but the links still point to the old one (a few do, but not all). The same thing happens when I try with WordPress, so there is that. I’m giving up on this … I need to do this some other way.
Also, I’m curious why you chose the WordPress export from Micro.blog instead of the Markdown one. I would think Markdown to Markdown would be easiest for MB to MB, though I’m not really sure whether consolidating or moving blogs within the platform has been a common use case.