User tagging cleanup rule not working as expected

steven_post · 03 Sep 2018

Hi,

I'm trying to set up user tagging on an application. The tag identifier rule matches the correct element, but the cleanup rule appears to be behaving erratically.

The 'user' matched is "Hello <first name> <last name>."

So I added the cleanup rule "Hello (.*)\.", but this doesn't clean anything up: I keep getting the complete string. Looking at the HTML source, I found that the spaces are not regular spaces but non-breaking spaces (&nbsp;), like this:

<td class="lightgrayTop">
	Hello&nbsp;Firstname&nbsp;Lastname.
</td>
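To illustrate why a literal space in the rule cannot match here, below is a minimal sketch in Python. It is not the tagging engine itself. It assumes the matched text reaches the regex with &nbsp; already decoded to the single character U+00A0; the sample strings and variable names are made up for the illustration.

```python
import html
import re

# Hypothetical sample: the cell text from the snippet above, still entity-encoded.
raw = "Hello&nbsp;Firstname&nbsp;Lastname."

# Assumption: the text is decoded before matching, so "&nbsp;" becomes U+00A0.
decoded = html.unescape(raw)

# The original cleanup rule, with a literal ASCII space (U+0020) after "Hello".
pattern = re.compile(r"Hello (.*)\.")

nbsp_match = pattern.search(decoded)                       # U+00A0 is not U+0020
space_match = pattern.search("Hello Firstname Lastname.")  # plain spaces

print(nbsp_match)            # None
print(space_match.group(1))  # Firstname Lastname
```

The rule only matches when the character after "Hello" is a regular space, which would explain why the complete string comes through unchanged.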

However, when I change the regex to accommodate this, I get the 'Anonymous' users again.

The regex looks fine to me and regexpal seems to match it.

I have no idea how the user "Maria 'O Donnel" gets turned into 'maria' (all lowercase) here, nor why the whole "'O Donnel" part would fall off.

Any help is greatly appreciated.

Regards,

Steven

Edit: fixed html code shown.

steven_post · 04 Sep 2018

I did some further testing: the 'Anonymous' users were caused by the 'do-not-track' feature I had enabled in private browsing.

Using the following regex sort of works:

\s*Hello([^.]+)\.\s*

This causes the resulting user tag to be captured as:

&nbsp;Firstname&nbsp;Lastname

So the tag has a leading non-breaking space. The same goes for:

\s*Hello\s*([^.]+)\.\s*

So the \s doesn't match the nbsp.
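The behaviour can be reproduced outside the product with the sketch below. It rests on two assumptions: that the engine's \s covers only ASCII whitespace (as in Java's default regex mode, for example), and that the nbsp arrives as the character U+00A0. Python's \s matches U+00A0 by default, so the re.ASCII flag is used here purely to mimic the narrower definition. Listing U+00A0 explicitly next to \s is one possible workaround, untested against the actual cleanup rule field.

```python
import re

# Hypothetical sample text with the nbsp as the real character U+00A0.
text = "Hello\u00a0Firstname\u00a0Lastname."

# re.ASCII limits \s to [ \t\n\r\f\v], mimicking an engine whose \s
# does not include the non-breaking space (an assumption about the engine).
ascii_ws = re.compile(r"\s*Hello\s*([^.]+)\.\s*", re.ASCII)

# Same rule, but with U+00A0 listed explicitly alongside \s.
explicit = re.compile(r"\s*Hello[\s\u00a0]*([^.]+)\.\s*", re.ASCII)

ascii_group = ascii_ws.search(text).group(1)
explicit_group = explicit.search(text).group(1)

print(repr(ascii_group))     # '\xa0Firstname\xa0Lastname' (leading nbsp kept)
print(repr(explicit_group))  # 'Firstname\xa0Lastname'
```

Whether the rule field accepts a \u00a0 or \xA0 escape depends on the regex flavour in use, so that part needs checking in the product.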

Adding the nbsp to the regex doesn't work either:

\s*Hello.nbsp;([^.]+)\.\s*
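One possible explanation, sketched below in Python, is that the rule is applied to the decoded text rather than the HTML source. A pattern containing the letters "nbsp;" can only match if the literal entity is still present. Both sample strings are hypothetical; which of the two the engine actually sees is an assumption that would need confirming.

```python
import re

# The rule from above: any one character, then the literal letters "nbsp;".
pattern = re.compile(r"\s*Hello.nbsp;([^.]+)\.\s*")

# Hypothetical inputs: the raw HTML source versus the decoded text.
entity_text = "Hello&nbsp;Firstname&nbsp;Lastname."
decoded_text = "Hello\u00a0Firstname\u00a0Lastname."

entity_match = pattern.search(entity_text)    # "." consumes "&", "nbsp;" follows
decoded_match = pattern.search(decoded_text)  # no "nbsp;" letters to find

print(entity_match.group(1))  # Firstname&nbsp;Lastname
print(decoded_match)          # None
```

If the second case is the one that applies, the rule has to target the character U+00A0 itself instead of the entity name.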

So the best I can do now is a regex that includes the nbsp in the group (the first regex in this post).

Edit: removed the brackets around the first whitespace