The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

Archive for the ‘SocialMedia’ Category

.NET/C#: Generating a WordPress posting categories page – part 1

Posted by jpluimers on 2012/07/31

From the category cloud it is hard to see that the categories are organized as a hierarchy. The combobox on the right shows that, but does not have room to properly show the hierarchy. Since WordPress.com does not allow you to deploy your own code, I worked around it in this way using a small .NET C# console program:

  1. Extract the HTML for the All Categories combobox on the right of the page.
  2. Convert that HTML to XHTML (and therefore XML)
  3. Generate XSD from that XML
  4. Generate C# class wrappers from the XSD

Future posts will show more logic on how to handle the imported information, and generate nice category overviews. Preliminary source code is at the BeSharp.net source repository.

Extract the HTML

The HTML is not fully accurate (see my post on HTML and XML escapes from last week), but it is fairly easy to extract. Most web browsers allow you to view the source of your web page. Do that, then search for “All Categories”. Now you see HTML like this:

<h2 class="widgettitle">All categories</h2>
<select class="postform" name="cat">
 <option value="-1">Select Category</option>
 <option class="level-0" value="256">About&nbsp;&nbsp;(66)</option>
 <option class="level-1" value="64">&nbsp;&nbsp;&nbsp;Personal&nbsp;&nbsp;(60)</option>
 <option class="level-2" value="20254983">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Adest Musica&nbsp;&nbsp;(7)</option>
 <option class="level-2" value="32122">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Certifications&nbsp;&nbsp;(2)</option>
 ...
 <option class="level-0" value="756">Comics&nbsp;&nbsp;(3)</option>
 <option class="level-0" value="780">Development&nbsp;&nbsp;(473)</option>
 <option class="level-1" value="872460">&nbsp;&nbsp;&nbsp;Database Development&nbsp;&nbsp;(55)</option>
 ...
 <option class="level-0" value="9280">User Experience&nbsp;&nbsp;(3)</option>
</select>

I don’t need the H2 heading line, but the rest I do need to generate XML from. I saved the HTML into a text file for processing by the console app.

Convert the HTML to XML

The HTML contains loads of &nbsp;, but XML does not allow for that entity. So the & ampersand needs to be escaped into &amp;. This also solves other uses of & in the HTML. The rest of the HTML is XHTML compliant, so it does not require changes, which results in this C# conversion method:

        private static string toXml(string inputHtml)
        {
            // Escape every ampersand so HTML-only entities such as &nbsp; become plain text in XML.
            string result = inputHtml.Replace("&", "&amp;");
            return result;
        }
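
To check the claim that escaping the ampersands is enough to make the snippet well-formed XML, a minimal sketch along these lines could help. The class name ToXmlCheck and the helper IsWellFormed are hypothetical names introduced for illustration only; XDocument.Parse is the standard LINQ to XML parser.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class ToXmlCheck
{
    // Same idea as the conversion method above: escape every ampersand,
    // so HTML-only entities such as &nbsp; become plain text in the XML.
    public static string ToXml(string inputHtml)
    {
        return inputHtml.Replace("&", "&amp;");
    }

    // Hypothetical helper: returns true when the text parses as XML.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XDocument.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            // Thrown, for instance, on a reference to the undeclared entity nbsp.
            return false;
        }
    }
}
```

For a snippet such as `<option value="256">About&nbsp;&nbsp;(66)</option>`, IsWellFormed on the raw text returns false because of the undeclared entity, while IsWellFormed on the output of ToXml returns true.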

Generate an XSD for the XML, then amend the XSD

Given my comparison of tools for generating XSD from XML, I used the XmlForAsp XML Schema generator with the “Separate Complex Types” option. (Note: I will link to the XSD before/after, as WordPress – yet again – screws up the XSD source code in the post; this should do for now.) That gives me XSD like this (XML is also at pastebin):

<?xml version="1.0" encoding="utf-8"?>
<xsd:schema attributeFormDefault="unqualified" elementFormDefault="qualified" version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <xsd:element name="select" type="selectType" />
 <xsd:complexType name="selectType">
  <xsd:sequence>
   <xsd:element maxOccurs="unbounded" name="option" type="optionType" />
  </xsd:sequence>
  <xsd:attribute name="name" type="xsd:string" />
  <xsd:attribute name="id" type="xsd:string" />
  <xsd:attribute name="class" type="xsd:string" />
 </xsd:complexType>
 <xsd:complexType name="optionType">
  <xsd:attribute name="value" type="xsd:int" />
 </xsd:complexType>
</xsd:schema>

That is not complete, but gives a good start. The actual XSD needs to look like this, with a more elaborate optionType complex type that also defines its own content as deriving from xsd:string and adds the class attribute (XML is also at pastebin):

<?xml version="1.0" encoding="utf-8"?>
<xsd:schema attributeFormDefault="unqualified" elementFormDefault="qualified" version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <xsd:element name="select" type="selectType" />
 <xsd:complexType name="selectType">
  <xsd:sequence>
   <xsd:element maxOccurs="unbounded" name="option" type="optionType" />
  </xsd:sequence>
  <xsd:attribute name="name" type="xsd:string" />
  <xsd:attribute name="id" type="xsd:string" />
  <xsd:attribute name="class" type="xsd:string" />
 </xsd:complexType>
 <xsd:complexType name="optionType">
  <xsd:simpleContent>
   <xsd:extension base="xsd:string">
    <xsd:attribute name="class" type="xsd:string" />
    <xsd:attribute name="value" type="xsd:int" />
   </xsd:extension>
  </xsd:simpleContent>
 </xsd:complexType>
</xsd:schema>

Generate C# classes from the XSD

You can generate C# wrapper classes using the XSD.exe tool that ships with Visual Studio, but XSD.exe is hard to use and hard to integrate into Visual Studio (despite a Microsoft Connect request for it), the code it generates still needs work for deserializing, and it has very limited generation options (heck, after it changed from .NET 1.x to 2.0, it hasn’t been updated for about a decade). XSD2Code has some great reviews, so I used that instead. And indeed, it integrates very well into Visual Studio 2010, and generates very nice C#, especially when you use these options (see also the screenshot on the right):

  • Under Serialization, set Enabled to True
  • Under Serialization, set GenerateXmlAttributes to True

That way, loading the HTML, converting it to XML, then deserializing it into object instances is as simple as this:

                string inputFileName = args[0];
                string inputHtml = getHtml(inputFileName);
                string xml = toXml(inputHtml);
                selectType select = selectType.Deserialize(xml);
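
The generated code itself is not shown here. As a rough sketch of what such a Deserialize method boils down to, the hand-written classes below map the amended XSD onto XmlSerializer attributes. SelectSketch, OptionSketch and DeserializeSketch are hypothetical names for illustration; they are not the classes XSD2Code generates, and the id attribute is left out for brevity.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hand-written stand-in for the select element of the XSD.
[XmlRoot("select")]
public class SelectSketch
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    [XmlAttribute("class")]
    public string Class { get; set; }

    // One array entry per <option> child element.
    [XmlElement("option")]
    public OptionSketch[] Options { get; set; }
}

// Stand-in for optionType: string content plus class and value attributes.
public class OptionSketch
{
    [XmlAttribute("class")]
    public string Class { get; set; }

    [XmlAttribute("value")]
    public int Value { get; set; }

    // The element text, for instance the category name and post count.
    [XmlText]
    public string Text { get; set; }
}

public static class DeserializeSketch
{
    // Turns the XML text into object instances.
    public static SelectSketch Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(SelectSketch));
        using (var reader = new StringReader(xml))
        {
            return (SelectSketch)serializer.Deserialize(reader);
        }
    }
}
```

Because the ampersands were escaped before parsing, the Text of an option still contains the literal characters &nbsp; after deserializing, so any code that reads the category names has to strip or translate those.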

More on actually working with the loaded instances in the next episode, including the great benefit of XSD2Code: it generates C# code as partial classes.

–jeroen

Posted in .NET, C#, C# 4.0, C# 5.0, Development, SocialMedia, Software Development, Usability, User Experience (ux), Web Development, WordPress, WordPress, XML, XML escapes, XML/XSD, XSD | 2 Comments »

XML and HTML escapes

Posted by jpluimers on 2012/07/26

While reviewing some client’s code, I noticed they were generating and parsing XML and HTML by hand (do not ever do that yourself!).

Before refactoring this into something that uses libraries that properly understand XML and HTML, I needed to assess some of the risks.

A major risk is getting the escaping (and unescaping) of XML and HTML wrong.

Time to finally organize some of my links on escaping HTML and XML that I had in my favourites list.

The starting point is the List of XML and HTML character entity references on Wikipedia. It is readable, complete and lists both kinds of escapes.

XML escapes

The official W3C text that describes XML escaping is hard to read.

There are only 5 predefined XML entities for characters that can (some must) be escaped. This table is derived from the Wikipedia article.

Name | Character | Unicode code point (decimal) | Standard | When to escape (from the XML 1.0 standard) | Description
quot | " | U+0022 (34) | XML 1.0 | To allow attribute values to contain both single and double quotes | double quotation mark
amp | & | U+0026 (38) | XML 1.0 | Outside a comment, a processing instruction, or a CDATA section | ampersand
apos | ' | U+0027 (39) | XML 1.0 | To allow attribute values to contain both single and double quotes | apostrophe (= apostrophe-quote)
lt | < | U+003C (60) | XML 1.0 | Outside a comment, a processing instruction, or a CDATA section | less-than sign
gt | > | U+003E (62) | XML 1.0 | In content, within the string "]]>", when that string is not marking the end of a CDATA section | greater-than sign
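
To make the table concrete, here is a minimal sketch of escaping and unescaping the five predefined entities, mainly to show why the order of the replacements matters. XmlEscapeSketch is a hypothetical name, and the code ignores numeric character references; real code is better off with a library that understands XML, such as the framework's own SecurityElement.Escape or an XmlWriter.

```csharp
using System;

public static class XmlEscapeSketch
{
    // Escapes the five predefined XML entities in a text value.
    // The ampersand goes first; otherwise the ampersands introduced
    // by the other replacements would be escaped a second time.
    public static string Escape(string text)
    {
        return text
            .Replace("&", "&amp;")
            .Replace("<", "&lt;")
            .Replace(">", "&gt;")
            .Replace("\"", "&quot;")
            .Replace("'", "&apos;");
    }

    // Reverses Escape. Here the ampersand goes last, so that
    // "&amp;lt;" becomes "&lt;" and not "<".
    public static string Unescape(string text)
    {
        return text
            .Replace("&lt;", "<")
            .Replace("&gt;", ">")
            .Replace("&quot;", "\"")
            .Replace("&apos;", "'")
            .Replace("&amp;", "&");
    }
}
```

Escaping in the wrong order is exactly the kind of mistake that hand-written XML code tends to contain: running the ampersand replacement last would turn < into &amp;lt; instead of &lt;.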

HTML escapes

Read the rest of this entry »

Posted in " quot, & amp, > gt, < lt, ' apos, ASCII, Development, Encoding, HTML, Power User, SocialMedia, Software Development, Unicode, Web Development, WordPress, XML, XML escapes, XML/XSD | 1 Comment »

Android help needed: App that cleans up the Contacts mess that the LinkedIn app left behind

Posted by jpluimers on 2012/07/23

It seems LinkedIn doesn’t react to my tweet for help, so maybe you can provide me (paid) with a custom build of Contact Remover Plus.

This is my problem: https://twitter.com/jpluimers/status/225625536976793600

The LinkedIn Android app has created dozens of LinkedIn links and photo links to many of my contacts.
This renders many functions on my phone unusable:

  • loading the native People app takes minutes
  • Contacts cannot be synced with my Google account any more
  • making a backup fails
  • I cannot edit contacts in the Android People app any more
  • calling a phone number usually takes more than 30 seconds, often more than a minute
  • etc.

I tried to count this for one particular contact with the “Contact Remover” App: His contact has so many pages of photos and LinkedIn links that I cannot count them any more. They are at least dozens, probably more than 100 of each.

I’m using an HTC Sensation phone with a regular Android 4 firmware.

With Android development, I haven’t done anything with the Contacts API yet, and don’t have the time to do so in the short term.

I’m willing to pay for an App that:

  • removes all LinkedIn links for all contacts
  • removes all LinkedIn photos for all contacts

Anyone?

–jeroen

Posted in About, Android Devices, Development, LinkedIn, Personal, Power User, SocialMedia, Software Development | 6 Comments »

#WordPress changed from Alt-Shift to Alt #keyboard #shortcuts and now breaks your regular browser Alt shortcuts. @wordpressdotcom

Posted by jpluimers on 2012/07/23

A couple of months ago, I created a nice post listing all #WordPress Editor #keyboard #shortcuts for both Windows and Mac OS X.

As of a few days ago, WordPress.com changed their Alt-Shift shortcuts into Alt shortcuts.

For instance, Alt-Shift-d (strike through) is now Alt-d, thereby blocking the original Alt-d (which for most browsers on Windows brings you to the address bar).

They violate one of the basic GUI principles: keep existing keyboard shortcuts as they are.

On Windows based browsers that means: keep Alt and Ctrl based shortcuts. Alt-Shift, Ctrl-Shift and Ctrl-Alt-Shift shortcuts are OK.

I haven’t tested WordPress on my MacBook Air yet (as I don’t think the end-users should be the WordPress.com beta testers, though they probably think the world at large is a big beta-test garden).

I have asked WordPress.com to change the shortcuts back to what they were.

–jeroen

via: #WordPress Editor #keyboard #shortcuts « The Wiert Corner – irregular stream of Wiert stuff.

Posted in Keyboards and Keyboard Shortcuts, Power User, SocialMedia, WordPress | Leave a Comment »

The end of the classic ThinkPad Keyboard layout (#Lenovo #Fail)

Posted by jpluimers on 2012/05/18

(Thanks to a “Missed Post” problem on WordPress.com, this one didn’t get posted on the scheduled date. Sorry for any inconvenience)

First Lenovo did away with 1920×1200 screens. Now they have done away with the ThinkPad keyboard layout.

Both were my compelling reasons for buying Lenovo.

In fact, they are now marked as forum.thinkpads.com • non-ThinkPad Lenovo Hardware.

New Lenovo X1 keyboard. No more ScrLk, Pause and local-menu keys, PrtScr key moved to impossible place. 6-key navigation split.

–jeroen

PS: Anyone in The Netherlands who has a new ThinkPad W701 with 1920×1200 screen for sale?

Posted in Keyboards and Keyboard Shortcuts, Missed Schedule, Power User, SocialMedia, WordPress | 2 Comments »

Some research links on “change assemblyversion during checkin ccnet” – via Google Search

Posted by jpluimers on 2012/05/15

(Thanks to a “Missed Post” problem on WordPress.com, this one didn’t get posted on the scheduled date. Sorry for any inconvenience)

One of the next steps in the automated build process I’m setting up is increasing AssemblyVersion values after successful builds.

It is in a CCnet / TFS2010 / VS2010 environment.

Some links:

–jeroen

via: change assemblyversion during checkin ccnet – Google Search.

Posted in .NET, C#, Continuous Integration, CruiseControl.net, Development, Missed Schedule, SocialMedia, Software Development, Source Code Management, TFS (Team Foundation System), WordPress | Leave a Comment »

“Missed schedule” on post « WordPress.com Forums

Posted by jpluimers on 2012/05/15

Just checked my post history (as most posts are scheduled months in advance) just to see a bunch marked “Missed Schedule“:

  • 20 user Inbound TCP connection limit in Windows 7 – Super User (Power User, Windows, Windows 7; 21 hours ago): Missed schedule
  • On the research list: Kidi.Net: Kinderen VEILIG op het Net via Giovanni Praet (giovannipraet) on Twitter. (About, Personal, Power User; 2012/05/14): Missed schedule
  • How to Fix Temporary Profile in Windows 7 (Power User, Windows, Windows 7; 2012/05/11): Missed schedule

I tried this trick, but it didn’t help:

wget http://www.domain.com/wp-cron.php

Anyone who knows how to work around this?

Edit: posted on the forum, and contacted staff. But any ideas are still welcome.

–jeroen

Posted in Missed Schedule, Power User, SocialMedia, WordPress | 2 Comments »

20 user Inbound TCP connection limit in Windows 7 – Super User

Posted by jpluimers on 2012/05/14

(Thanks to a “Missed Post” problem on WordPress.com, this one didn’t get posted on the scheduled date. Sorry for any inconvenience)

You need to be administrator to see the output of the “net config server” command.

The inbound/outbound limit is 20:

Running ‘net config server’ at the command-line suggests that Windows 7 can support up to 20 inbound / 20 outbound incomplete connections.

–jeroen

via: Inbound TCP connection limit in Windows 7 – Super User.

Posted in Missed Schedule, Power User, SocialMedia, Windows, Windows 7, WordPress | Leave a Comment »

On the research list: Kidi.Net: Kinderen VEILIG op het Net via Giovanni Praet (giovannipraet) on Twitter.

Posted by jpluimers on 2012/05/14

(Thanks to a “Missed Post” problem on WordPress.com, this one didn’t get posted on the scheduled date. Sorry for any inconvenience)

I must try to see if this is going to work with my mentally disabled brother: Kidi.Net: Kinderen VEILIG op het Net.

–jeroen

via: Giovanni Praet (giovannipraet) on Twitter.

Posted in About, Missed Schedule, Personal, Power User, SocialMedia, WordPress | Leave a Comment »

How to Fix Temporary Profile in Windows 7

Posted by jpluimers on 2012/05/11

(Thanks to a “Missed Post” problem on WordPress.com, this one didn’t get posted on the scheduled date. Sorry for any inconvenience)

You can fix the “temporary profile” in Windows 7 if you have access to the registry.

So it totally depends on how tight security at your clients is, and how fast their alternative processes are…

–jeroen

via: How to Fix Temporary Profile in Windows 7.

Posted in Missed Schedule, Power User, SocialMedia, Windows, Windows 7, WordPress | Leave a Comment »