Category Archives: Scripting

Tips, examples and discussions for PowerShell, Bash and Windows Command Prompt.

How to read an HTML page with UTF8 encoding using PowerShell

Yesterday I was working on a project that requires heavy scraping of HTML page content.

I wrote several PowerShell scripts that scraped the content using Invoke-WebRequest and Invoke-RestMethod. Everything was great and smooth… until I got to an HTML page with some Greek letters inside. To my surprise, both built-in PowerShell cmdlets failed miserably when I tried to retrieve those UTF-8 encoded pages. In short, I was bombarded with ℠and ¢ here and there.

So, after I lost several hours trying to figure out what was going on and experimenting with all kinds of options, it turned out that it’s impossible to properly read a UTF-8 encoded page without a BOM using Invoke-WebRequest.

Here is a simple function I wrote which uses .NET classes to tackle the problem.

Note that this is just a simple example and it lacks the extensive functionality you get with Invoke-WebRequest.

Reusing page jQuery in a Greasemonkey script

Many times I’ve used a RIA or a website and wanted it to behave differently or to have some additional functionality that I needed. Thanks to Greasemonkey, this is possible.

We live in a jQuerified world where most web pages use jQuery.

That’s why one of the things I need to do most of the time when I write a userscript is to reuse the jQuery from the webpage without loading a new jQuery instance from the web. This makes my script much more responsive, and who needs to load something that is already there?

Save yourself some time: don’t reinvent the wheel, just use the boilerplate code I use for my Greasemonkey scripts.
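As a rough illustration of the general idea (not the boilerplate from the full post), here is a minimal sketch. The helper name `getPageJQuery` is hypothetical, and it assumes the page's window object is reachable from the userscript, typically as `unsafeWindow`, or as plain `window` when the script is not sandboxed. Since other libraries also define `$`, the sketch checks for jQuery's version string at `fn.jquery` before trusting what it finds.

```javascript
// Hypothetical helper: returns the jQuery instance already loaded by the page,
// or null when the page does not expose one.
function getPageJQuery(pageWindow) {
    var candidate = pageWindow && (pageWindow.jQuery || pageWindow.$);

    // jQuery stores its version string in `fn.jquery`; use it to tell
    // a real jQuery apart from some other library that happens to define `$`.
    if (typeof candidate === 'function' &&
        candidate.fn &&
        typeof candidate.fn.jquery === 'string') {
        return candidate;
    }
    return null;
}

// Assumption: `unsafeWindow` is the page's window in a sandboxed userscript;
// fall back to `window` when it is not defined.
var pageWindow = typeof unsafeWindow !== 'undefined' ? unsafeWindow
               : typeof window !== 'undefined' ? window
               : null;

var $ = getPageJQuery(pageWindow);
if ($) {
    // The page's own jQuery is available, so no second copy has to be loaded.
    console.log('Reusing page jQuery version ' + $.fn.jquery);
}
```

If the helper returns null, the page has no usable jQuery and the script would have to load its own copy or work without it.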

Continue reading Reusing page jQuery in a Greasemonkey script