BlogEngine.NET

A 5-post collection

Short URL Rewriting BlogEngine.NET

Since beginning this blog, one of my goals has been to eventually rewrite the URLs. I've recently been successful. Here's how, for version 1.5. When patching borrowed software I strive for minimal changes. This is also my first foray into this subject, so as usual: "it works, but it may not be right".

This particular method uses the IIS7 URL Rewrite module (v1.1). If you're hosted on an earlier version of IIS, ask to be transferred to an IIS7 server if your host offers one, or (if you appreciate problem solving) find another method. Thanks to IIS7's web application pool isolation this is possible even with shared hosting.

Assuming you or your host have successfully installed the URL Rewrite module, the rest will be easy (though figuring it out wasn't). Here is the <rewrite> rule section to add to your site's root Web.config:

<system.webServer>
    <rewrite>
        <rules>
            <rule name="post redirect">
                <match url="^post/(.*)\.aspx$" />
                <action type="Redirect" url="{R:1}" redirectType="Found"/>
            </rule>
            <rule name="post rewrite">
                <match url="^([0-9]{4}/[0-9]{2}/[0-9]{2})/(.*)$" />
                <action type="Rewrite" url="post.aspx?date={R:1}&amp;slug={R:2}" />
            </rule>

            <rule name="blog redirect">
                <match url="^blog/(.*)$" />
                <conditions>
                    <add input="{REQUEST_URI}" matchType="Pattern" pattern="^/blog/admin/.*$" negate="true" />
                    <add input="{REQUEST_URI}" matchType="Pattern" pattern="^/blog/User controls/.*$" negate="true" />
                </conditions>
                <action type="Redirect" url="{R:1}" redirectType="Found"/>
            </rule>
            <rule name="blog rewrite">
                <match url="^(.*)$" />
                <conditions>
                    <add input="{REQUEST_URI}" matchType="Pattern" pattern="^/blog/admin/.*$" negate="true" />
                    <add input="{REQUEST_URI}" matchType="Pattern" pattern="^/blog/User controls/.*$" negate="true" />
                </conditions>
                <action type="Rewrite" url="blog/{R:1}" />
            </rule>
        </rules>
    </rewrite>
</system.webServer>
  • The blog redirect & rewrite rules strip out the /blog web application folder from the URL. If your virtual directory is named otherwise, you will need to make that adjustment in the match, condition, and rewrite URLs. The redirect rule ensures that anyone visiting the old URL is instead taken to the new one, and the rewrite rule makes the new one actually function.

    You will note that the /admin and /User controls (extensions) folders have been excluded from this substitution, because otherwise they cease functioning. (Though probably fixable, I haven't yet bothered.)

  • The post rules perform a similar redirect and rewrite, stripping out the /post subdirectory and the trailing .aspx suffix, allowing post links to be in the following form: http://codeoptimism.com/2009/12/04/Short-URL-Rewriting-BlogEngineNET (/blog would be in there without the other rule).
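To see what the four patterns do to a request, here is a rough Python model of the rules (Python only because it is easy to run; this is not how IIS itself evaluates them). It assumes the rules run top to bottom, that a redirect ends processing, that a rewrite hands its result to the later rules, that conditions are tested against the original request URI, and that the query string built by the post rewrite survives the blog rewrite. The function name resolve is made up for this sketch.

```python
import re

# Patterns copied from the <match url="..."> and <add pattern="..."> attributes above.
# The match patterns see the request path without its leading slash.
POST_REDIRECT = re.compile(r"^post/(.*)\.aspx$")
POST_REWRITE = re.compile(r"^([0-9]{4}/[0-9]{2}/[0-9]{2})/(.*)$")
BLOG_REDIRECT = re.compile(r"^blog/(.*)$")
EXCLUDED = (
    re.compile(r"^/blog/admin/.*$"),
    re.compile(r"^/blog/User controls/.*$"),
)


def resolve(path):
    """Return (action, url) for one request path (simplified model, see assumptions)."""
    request_uri = "/" + path  # assumption: conditions see the original URI
    query = ""

    # Rule "post redirect": old /post/....aspx links go to the short form.
    match = POST_REDIRECT.match(path)
    if match:
        return ("redirect", match.group(1))

    # Rule "post rewrite": the short form is served by post.aspx internally.
    match = POST_REWRITE.match(path)
    if match:
        query = "?date={}&slug={}".format(match.group(1), match.group(2))
        path = "post.aspx"

    # Both blog rules are skipped for the excluded folders.
    if any(pattern.match(request_uri) for pattern in EXCLUDED):
        return ("none", path + query)

    # Rule "blog redirect": strip /blog from the visible URL.
    match = BLOG_REDIRECT.match(path)
    if match:
        return ("redirect", match.group(1))

    # Rule "blog rewrite": everything else is served from /blog internally.
    return ("rewrite", "blog/" + path + query)


for example in (
    "blog/post/2009/12/04/Some-Slug.aspx",
    "post/2009/12/04/Some-Slug.aspx",
    "2009/12/04/Some-Slug",
    "blog/admin/default.aspx",
):
    print(example, "->", resolve(example))
```

Followed by hand, an old link takes two redirects (first losing /blog, then /post and .aspx) before the final request is rewritten to blog/post.aspx with the date and slug in the query string.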

Here's where things get interesting. You may be wondering why the post rewrite rule isn't simply adding /post and .aspx in a simplistic and peaceful reflection of the post redirect rule. Ah, if only it were so easy.

BlogEngine performs its own fun rewriting in BlogEngine.Core\Web\HttpModules\UrlRewrite.cs, and the Blog-Post.aspx pages themselves are actually faked. The true URLs are in the form http://codeoptimism.com/blog/post.aspx?id=686a72df-15cb-48bb-8f56-b40ffddb6af5. So why didn't they simply drop the .aspx themselves, or why can't I modify it to do so? I believe the answer is that without the .aspx extension the server fails to direct traffic to the web app whatsoever, and that's beyond my familiarity/access.

So there still shouldn't be a problem, the .aspx would be added by the rewrite rule, right? Wrong. The rewrite is a rewrite, not a redirect; it only affects the URL's appearance, and you'd be in for a nasty 404. (And before you say it, using a redirect is square one, precluding the formatting we seek.)

I simply patched BlogEngine.Web\post.aspx.cs to accept my own (.aspx included) rewritable format: post.aspx?date=2009/12/04&slug=Short-URL-Rewriting-BlogEngineNET.

You could stop there; the URLs will be correct in the address bar, though not in the links on the page. For those we need to tweak the RelativeLink property of the Post class in BlogEngine.Core\Post.cs and rebuild BlogEngine.Core.dll.
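The two patches are mirror images: one builds the short link, the other finds the post again from the date and slug that the rewrite rule passes along. The actual C# changes aren't reproduced in this post, so what follows is only a sketch of that round trip in Python. The dictionary keys, both function names, and the case-insensitive slug comparison are assumptions made for illustration, not BlogEngine's code.

```python
from datetime import date, datetime


def relative_link(post):
    """Short link for a post: /yyyy/mm/dd/slug, with no /post prefix and no .aspx."""
    return "/{:%Y/%m/%d}/{}".format(post["created"], post["slug"])


def find_post(posts, date_param, slug_param):
    """Look up the post named by the date and slug query parameters."""
    day = datetime.strptime(date_param, "%Y/%m/%d").date()
    for post in posts:
        # Assumption: slugs are compared without regard to case.
        if post["created"] == day and post["slug"].lower() == slug_param.lower():
            return post
    return None


posts = [{"created": date(2009, 12, 4), "slug": "Short-URL-Rewriting-BlogEngineNET"}]

link = relative_link(posts[0])
print(link)

# The rewrite rule splits the link back into its date and slug parts.
_, year, month, day, slug = link.split("/")
found = find_post(posts, "{}/{}/{}".format(year, month, day), slug)
print(found is posts[0])
```

Matching on date plus slug rather than the GUID is what lets the visible URL stay readable, at the cost of assuming no two posts share both.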

If you're not one to tweak the source code you may download an otherwise vanilla 1.5 copy of BlogEngine.Core.dll from me. Throw that in the /bin folder on your site. Lastly, you may wish to replace instances of <%=Utils.AbsoluteWebRoot %> in your theme files with http://yoursite.com. I only had to change the one on the logo in site.master myself.

If you're thinking, "Hm, I should patch AbsoluteWebRoot and RelativeWebRoot in BlogEngine.Core/Utils.cs!", knock yourself out and comment here when you've got both changes working. (Unfortunately more is required, so I left them be.)

Summary

  • Copy and paste the <rewrite> section above into the <system.webServer> section of your site's root Web.config file.
  • Put my post.aspx.cs in your BlogEngine folder, or patch your own.
  • Put my BlogEngine.Core.dll in your BlogEngine/bin folder, or patch Post.cs & build your own.
  • Replace relevant instances of AbsoluteWebRoot in your theme files with http://yoursite.com (no /blog) if you'd like.

Defeating manual spam, or damn dastardly conniving commenters!

I'm not especially keen on meta-blog posts, but the issue came up in email recently and I've this penchant for expounding at some length on interesting subjects, even in the least suitable medium, target audience: one. Fortunately I have a blog.

Not so fortunately, manually entered spam has been an issue. When you optimize for humans you regrettably include manual spammers.

Such spam is surprisingly devious, but here are some common characteristics:

  • Complimentary. "Wow this is a great post!"
  • Relative. "I don't like spam."
  • Unconstructive. Adds nothing of value.
  • Disingenuous. "I appreciate it because…"
  • Erroneous. "…this should keep spam out of my email inbox."

See! Manual spam is recognizably crude.

ego spam

Of course the payload is the link, and they aren't all blatant advertisements, but even sites which might appear legit may advertise themselves unscrupulously. As expected they will lack real content.

Here with BlogEngine.NET the payload is put in the website field and not the body (the name field is the link text). Neither asking for a real name nor pointing out that search engines ignore the website field made a difference.

Akismet did, however, and thus far I've had zero false positives, only false negatives. Some flagged comments have been crafted so cleverly as to be very close calls, but after investigating I've concurred with Akismet. If Akismet marks a unique (but crude) comment as spam I expect the link to be unsatisfactory, given that it's the defining constant.

My recent addition of reCAPTCHA seems to have made the largest difference, most likely because there's now some difficulty involved. I actually feel pretty good about this because the duality of distinguishing computers from humans while simultaneously solving complex problems that computers do poorly absolutely fascinates me. Given that solving my captcha is no longer technically a waste of time, I know some readers will begin to feel the same.

Akismet support for BlogEngine.NET 1.5

4 Oct 2009 Not sure why I didn't notice before, but the Commentor extension has been around for some time! It solves what I still needed: a place to manage all of the comments from a centralized location. The code below still adds spam management to individual comments, and the two may well work together without incident.

Since I previously mentioned comment spam I've experimentally bolstered the defenses of this blog in my update to BlogEngine.NET 1.5, and… received the same spam. It's manually entered, and highly deceptive, frequently a "thanks, X helped me with Y" or just ever so subtly off-topic, often only the spam URL giving it away. At least this has absolved naive captcha of blame (still a little randomization in field names for playback bots might be a good idea).

So, BlogEngine.NET community, let's do something about the manual spam problem and integrate Akismet with a spam moderation queue like the WordPress plugin.

I've personally tackled it in my usual hackish fashion and converted BlogEngine's entire moderation queue into a spam queue (essentially, moderation is enabled but any comment not flagged as spam is immediately approved). Who wants to have to moderate everything anyway?

The actual Akismet comment checking is accomplished with Joel Thom's ASP.NET API, which you may need to download.

Following that, here are the relevant comparison reports:

BlogEngine.Core 1.5 with Akismet

BlogEngine.Web 1.5 with Akismet

Joel.Net.Akismet.1.0.1 for BlogEngine.NET

Don't expect perfection; in particular, my error handling is probably unduly sparse. I'm sure I'll notice it when it breaks painfully.

You will need a WordPress API key to use Akismet in the first place. Get one and specify it, along with your blog URL, at the top of BlogEngine.Web\User controls\CommentView.ascx.cs.
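For the curious, the check itself boils down to a single HTTP POST. The sketch below builds that request by hand in Python against Akismet's comment-check endpoint instead of going through Joel's library (whose API is not reproduced here), then applies the spam-queue rule described earlier: approve unless flagged. The comment dictionary keys, the helper names, and the placeholder key are made up for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_comment_check(api_key, blog_url, comment):
    """Build (but do not send) an Akismet comment-check request."""
    url = "https://{}.rest.akismet.com/1.1/comment-check".format(api_key)
    fields = {
        "blog": blog_url,                 # the blog URL registered with the key
        "user_ip": comment["ip"],         # commenter's IP address
        "user_agent": comment["user_agent"],
        "comment_type": "comment",
        "comment_author": comment["author"],
        "comment_author_url": comment["website"],  # where the spam payload tends to sit
        "comment_content": comment["content"],
    }
    return Request(url, data=urlencode(fields).encode("utf-8"), method="POST")


def is_spam(response_body):
    """Akismet answers with the literal text 'true' (spam) or 'false' (ham)."""
    return response_body.strip() == "true"


def moderate(comment, flagged_as_spam):
    """Spam-queue rule: anything not flagged is approved straight away."""
    comment["approved"] = not flagged_as_spam
    return comment


comment = {
    "ip": "203.0.113.7",
    "user_agent": "Mozilla/5.0",
    "author": "Cheap Widgets",
    "website": "http://example.org/",
    "content": "Wow this is a great post!",
}
request = build_comment_check("YOUR_API_KEY", "http://example.com/blog", comment)
print(request.get_method(), request.full_url)

# Sending it would be urllib.request.urlopen(request); here the reply is faked.
print(moderate(comment, is_spam("true"))["approved"])
```

The Ham and Spam links listed under Additional Details below would post the same fields to Akismet's submit-ham and submit-spam endpoints.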

Further you'll have to compile the changes to Core, update the dll in Web's bin, add a reference to Core in Joel.Net.Akismet, compile, copy dll assembly, and reference that from Web as well, but you already knew that because you're a programmer and the using references give it away anyway. Right? ;)

Or you could use this tidy package I've provided, keeping in mind to only replace files/assemblies you haven't modified from the stock download, and merge the rest. (If you've no interest in source code you may ignore the BlogEngine.Core and Joel.Net.Akismet.1.0.1 folders, the compiled assemblies are already in BlogEngine.Web.)

You must also Enable comment moderation.

Additional Details

  • Commenters are notified immediately if their comment requires moderation (the JavaScript now provides, very hackishly, an "isModerated" case).
  • Moderated comments have a new administrative Ham link for submitting false positives back to Akismet (this also approves the comment).
  • Approved (visible) comments have a new administrative Spam link for submitting false negatives back to Akismet (this also deletes the comment).
  • These links have been added to the corresponding admin comment notification emails as well.
  • Comment moderation has been toggled on in settings.xml (for XML data source blogs).
  • The Joel.Net.Akismet API has been modified to take the HttpRequest and BlogEngine.Core.Post directly.
  • Only labels.resx has been updated; this package is not localized (and could be a little cleaner for that). I'm but one English-speaking man.

Please improve upon this, make it an extension (I don't think it can be 100%), or otherwise incorporate it into an official version.

Implementing a naive captcha in BlogEngine.NET

4 Oct 2009 Keith Ratliff went to the very involved work of converting BlogEngine's comment submission process from JavaScript-centric to postback and standard ASP.NET validation, thereby enabling a more or less drag and drop installation of reCAPTCHA. Hooray Keith! Fantastic work. See that post instead.

A couple of years ago Mads Kristensen implemented an invisible captcha in BlogEngine.NET, but as my blog attested, this is not enough.

Instead of inconveniencing readers with a captcha, you can use your own clever validation trick. The more unique it is, the less likely it will be automatically discovered and circumvented. When it is, you need a new trick.

A naive captcha is basically a captcha that's always the same image, and works off of the principle that your site isn't important enough for spammers to manually specify (how cheerful!), but if it's good enough for Coding Horror it's good enough for me.

Of course, being an image in itself resists automated discovery of this particular trick, and if it is discovered, manually or otherwise, it's easy to change the image (it doesn't even have to be of text).
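The server side of such a check is tiny. Here is a sketch in Python (not the actual BlogEngine patch): because the image never changes, the expected answer is just a constant. Ignoring case and surrounding whitespace is an assumption, and the function name is made up.

```python
# The word shown in the (always identical) captcha image.
# Change it together with the image if the trick is ever discovered.
EXPECTED_WORD = "chicken"


def passes_naive_captcha(submitted):
    """Accept the comment only if the visitor typed the word in the image."""
    # Assumption: be forgiving about case and stray spaces.
    return submitted.strip().lower() == EXPECTED_WORD


print(passes_naive_captcha(" Chicken "))
print(passes_naive_captcha(""))
```

A bot replaying a generic form submission leaves the field empty or fills it with junk, so it fails without any per-request state on the server.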

Implementing my own naive captcha here has been quite effective so far. My next step may be Akismet for manually entered spam.

Implement your own

The patched (against vanilla BlogEngine.NET 1.4.5) files are available. For making the change to your existing and customized blog, take a look at this comparison courtesy of Beyond Compare 3, or view the compact version below; this post needed some color.

You'll want to change the paths and formatting in CommentView.ascx to suit your liking, as well as the word "chicken".

Oh, and don't forget that my code sucks. Someone please be my guest and make this a properly coded BlogEngine.NET extension. My first attempt was with the strictly-server-side RegularExpressionValidator control you see commented out below, which I couldn't get to work, so I used existing mechanisms instead.

Modified (check margin) lines are in red. Unimportant differences are in blue (mostly, the JavaScript isn't truly commented). The rest is context.

.. and unto the public I emerge!

I recently wrote a Firefox extension of such usefulness that it has finally forced me to break my habit of never getting around to releasing any of my projects publicly. So I purchased some ASP.NET hosting, installed BlogEngine.NET, configured the default theme to be less annoying, spent many hours pulling the Slug feature from the latest in-development version of BlogEngine and adding it to the 1.1 release, and here we finally are. Slugs and all. Why didn't I just run the development version? Well, I tried, but had issues. I might still, or I could just wait until 1.2's official release planned for sometime in September.

It will be some time before the rest of my projects can showcase themselves, but at least now they will have a home.

And it will also be some time before this blog is precisely how I'd like it, but at least I have the power to shape it, and having already delved into BlogEngine's source code once, I should be in good shape myself.

Now let's look forward to the coming attractions. ;)
