Twatter: April 2008

I had some strange errors when adding a user control.

First of all it told me:

Element 'searchresults' is not a known element. This can occur if there is a compilation error in the Web site

Then I tried to compile the user control and it said:

The name 'lnavigation' does not exist in the current context

So I figured the variable wiring up wasn't working and added a definition for lnavigation manually. Then I got:

The type 'referencesearchresultsusercontrol' already contains a definition for 'lnavigation'

Very frustrating.

Later I also got

The file 'src' is not a valid here because it doesn't expose a type in the register tag.

Turns out the register tag was the problem: the Src attribute pointed at the ".ascx.cs" file instead of the ".ascx" file, so I had to remove the ".cs" from it.

Change
<%@ Register TagPrefix="blahblah" TagName="searchresults" Src="searchresults.ascx.cs" %>

To
<%@ Register TagPrefix="blahblah" TagName="searchresults" Src="searchresults.ascx" %>

I was trying to replace any non-letter characters using regular expressions, which turned out to be a bit of a pain when Unicode / accented characters were used.

I ended up matching the characters that I wanted to keep and removing everything else. Regular expressions aren't really set up for this, as there isn't really a general "not" operator.

This page was very useful:

http://www.regular-expressions.info/unicode.html

This expression did the trick for me. It matches everything, but only puts back (via the replacement string) the matches that I wanted to keep.

Regex.Replace(authors, @"(?<all>(?<allowed>\p{L}\p{M}*|[ ,;|-])|(?<notallowed>.))", "${allowed}", RegexOptions.Compiled | RegexOptions.Multiline);

\p{L} matches any letter character without a separate accent
\p{M} matches any accent
\p{L}\p{M}* matches any letter character with any number of accents (it is possible to have more than one)
[ ,;|-] matches any special characters that I wanted to keep
the all group matches everything
the allowed group matches characters that I want to keep
the not allowed group matches anything else
The first expression in an or (|) group that matches is the one that is used, so if the allowed group matches then the not allowed group doesn't.

"${allowed}" in the replace string replaces a match with the contents of the allowed group. Since everything is matched, nothing remains of the original string. If a not allowed match is replaced, there is nothing in the allowed group, so it is replaced with nothing.

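As a rough sketch, the expression can be wrapped in a small helper like the one below. The group names all, allowed and notallowed follow the description above; the class name AuthorCleaner and the method name Clean are made up for illustration. RegexOptions.Multiline is left out here because it only changes how ^ and $ behave, and the pattern uses neither.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class AuthorCleaner
{
    // "allowed" captures a letter plus any combining marks, or one of the
    // separator characters to keep; "notallowed" captures any other character.
    private static readonly Regex KeepAllowed = new Regex(
        @"(?<all>(?<allowed>\p{L}\p{M}*|[ ,;|-])|(?<notallowed>.))",
        RegexOptions.Compiled);

    // Each match is replaced by the contents of the "allowed" group, which is
    // empty whenever the "notallowed" alternative was the one that matched.
    public static string Clean(string authors)
    {
        return KeepAllowed.Replace(authors, "${allowed}");
    }
}
```

Calling AuthorCleaner.Clean("Müller, J.; O'Brien 123") should drop the full stop, the apostrophe and the digits and keep the rest. One thing to watch: by default "." does not match a newline, so line breaks are never matched and stay in the result.
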
Some notes:
Accented characters in Unicode can be represented by a single character (for legacy reasons) or as a combination of a base character and one or more accent characters.
Thus a single character on screen such as é can be represented by either one or two Unicode characters.
It is thus not possible to reliably match accented characters in the usual way using square brackets [], since a character class only matches one character at a time.
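
To make the notes above concrete, here is a small sketch comparing the two ways of writing é. The class and member names (AccentDemo and so on) are made up for illustration; the only library call used is Regex.IsMatch.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class, named for illustration only.
public static class AccentDemo
{
    public const string Precomposed = "\u00E9"; // é as a single character
    public const string Decomposed = "e\u0301"; // e followed by a combining acute accent

    // A character class matches one character at a time, so [é] only covers
    // the whole string when é is stored as a single character.
    public static bool ClassMatchesWhole(string s)
    {
        return Regex.IsMatch(s, "^[\u00E9]$");
    }

    // A letter followed by any number of combining marks covers both forms.
    public static bool LetterPlusMarksMatchesWhole(string s)
    {
        return Regex.IsMatch(s, @"^\p{L}\p{M}*$");
    }
}
```

ClassMatchesWhole is true for Precomposed but false for Decomposed, while LetterPlusMarksMatchesWhole is true for both, which is why the expression uses \p{L}\p{M}* rather than a list of accented letters in square brackets.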

Tuesday, April 29, 2008

asp.net cs0103 cs0101

Monday, April 28, 2008

.Net Regular Expressions and accented / unicode characters
