Migrating data from SQL into Windows Azure Table Storage
Monday, September 16. 2013
The error messages you get when an Azure Table Storage insert fails are far from descriptive.
This is the complete list of supported datatypes (or Property Types as they call them):
- Binary: An array of bytes up to 64 KB in size.
- Bool: A Boolean value.
- DateTime: A 64-bit value expressed as UTC time. The supported range of values is 1/1/1601 to 12/31/9999.
- Double: A 64-bit floating point value.
- GUID: A 128-bit globally unique identifier.
- Int: A 32-bit integer.
- Int64: A 64-bit integer.
- String: A UTF-16-encoded value. String values can be up to 64 KB in size.
Really. Nothing more. You just have to get along with that one! 
The list is taken from Windows Azure Table Storage and Windows Azure SQL Database - Compared and Contrasted.
Things you fail to notice:
- The .NET DateTime structure has a range of 00:00:00 (midnight), January 1, 0001 Anno Domini (Common Era) through 11:59:59 P.M., December 31, 9999 A.D. (C.E.) in the Gregorian calendar. It does not start from January 1, 1601 AD like the Table Storage DateTime does.
- That shouldn't be an issue, but my app had problems and it had recorded dates in the year 201. This was a really nice way of finding that out.
- For integers, there are no unsigned versions.
- For decimal numbers, there is no decimal, the 128-bit floating point number. You have to settle for Double, an IEC 60559:1989 (IEEE 754) compliant version.
- There is no reasonable way of storing money-type data, which needs an exact number with no floating point conversions.
- The string really is UTF-16, the two-byte version. It stores up to 32768 characters.
- Which is not much when compared to TEXT or varchar(max), which range from 2 GiB to anything you have.
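The limits listed above can be checked before an insert is even attempted. Here is a minimal sketch in Python, assuming naive UTC datetimes. The helper name check_property is hypothetical and not part of any Azure SDK; the limits are the ones quoted in the property type list.

```python
from datetime import datetime
from decimal import Decimal

# Limits quoted in the property type list above.
MIN_DATETIME = datetime(1601, 1, 1)   # Table Storage lower bound (naive UTC assumed)
MAX_BYTES = 64 * 1024                 # applies to both Binary and String
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def check_property(value):
    """Return a list of reasons why value would not fit a Table Storage property.

    Hypothetical helper for illustration; an empty list means nothing was found.
    """
    problems = []
    if isinstance(value, bool):
        pass  # Bool is fine as-is (tested before int, because bool is an int in Python)
    elif isinstance(value, datetime):
        if value < MIN_DATETIME:
            problems.append("DateTime is before 1601-01-01")
    elif isinstance(value, int):
        if not INT64_MIN <= value <= INT64_MAX:
            problems.append("integer does not fit into Int64")
    elif isinstance(value, Decimal):
        problems.append("there is no decimal type, only Double")
    elif isinstance(value, str):
        # UTF-16 uses at least two bytes per character.
        if len(value.encode("utf-16-le")) > MAX_BYTES:
            problems.append("String is over 64 KB as UTF-16")
    elif isinstance(value, (bytes, bytearray)):
        if len(value) > MAX_BYTES:
            problems.append("Binary is over 64 KB")
    return problems


if __name__ == "__main__":
    print(check_property(datetime(201, 5, 1)))   # a date from the year 201
    print(check_property("x" * 32769))           # one character too many
```

Running something like this over the SQL rows before migrating them would have told me about the year 201 dates without having to decode an Azure error message.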
Hopefully this list helps somebody. I spent a nice while finding all these out.
Using PHP, Zend Framework, PDO and FreeTDS in Windows Azure
Wednesday, September 4. 2013
Earlier I wrote about IPv6-connectivity with MS SQL server into Linux / PHP with FreeTDS.
This time my quest with FreeTDS continued. I put together the most minimal CentOS 6.4 Linux possible, with just enough parts to run an Nginx / PHP-FPM / Windows Azure SQL Database -based web application. The acronym would not be LAMP, but NPFWASD. No idea how to pronounce "npf-wasd", though. 
I packaged a Hyper-V -based Linux .vhd into an Azure virtual machine IaaS-image and created a couple of load-balanced HTTP-ports for it. The problem was luring PHP's PDO to connect to Azure SQL via FreeTDS dblib. I spent a good while banging my head and kicking it, before it stopped resisting and started to obey my commands.
Everything would have gone much better if only I had had the proper version of FreeTDS installed on the Linux box. When I realized that the TDS-protocol version is hyper-important in Azure SQL, I also realized that my FreeTDS-version was not the one it was supposed to be. My own package would have been the correct one (see the earlier post). My tsql -C says:
Compile-time settings (established with the "configure" script)
Version: freetds v0.92.dev.20130721
freetds.conf directory: /etc
MS db-lib source compatibility: yes
Sybase binary compatibility: yes
Thread safety: yes
iconv library: yes
TDS version: 7.1
iODBC: no
unixodbc: yes
SSPI "trusted" logins: no
Kerberos: yes
The default TDS version of 7.1 is really, really important there. With that I can do:
tsql -H -my-designated-instance-in-Azure-.database.windows.net \
-p 1433 \
-U -the-application-SQL-user-without-admin-rights- \
-D -my-own-database-in-the-SQL-box-
It simply works: it displays the prompt and everything works as it should. In my Zend Framework application configuration I say:
resources.db.adapter = "Pdo_Mssql"
resources.db.params.host = "-my-designated-instance-in-Azure-.database.windows.net"
resources.db.params.dbname = "-my-own-database-in-the-SQL-box-"
resources.db.params.username = "-the-application-SQL-user-without-admin-rights-"
resources.db.params.password = "-oh-the-top-secret-passwrod-"
resources.db.params.version = "7.1"
resources.db.params.charset = "utf8"
resources.db.params.pdoType = "dblib"
No issues there. Everything works.
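Since the compiled-in TDS version turned out to be the critical piece, it can be checked automatically before debugging anything else. Here is a minimal sketch in Python that parses the text printed by tsql -C. The function names are hypothetical, and the sample text is the output shown earlier in this post.

```python
# Output of "tsql -C" as shown above, used here as sample input.
SAMPLE_OUTPUT = """\
Compile-time settings (established with the "configure" script)
Version: freetds v0.92.dev.20130721
freetds.conf directory: /etc
MS db-lib source compatibility: yes
Sybase binary compatibility: yes
Thread safety: yes
iconv library: yes
TDS version: 7.1
iODBC: no
unixodbc: yes
SSPI "trusted" logins: no
Kerberos: yes
"""


def parse_tsql_settings(text):
    """Turn 'key: value' lines into a dict; lines without a colon are skipped."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def default_tds_version(text):
    """Return the compiled-in default TDS version, or None if it is not listed."""
    return parse_tsql_settings(text).get("TDS version")


if __name__ == "__main__":
    version = default_tds_version(SAMPLE_OUTPUT)
    if version != "7.1":
        print("Unexpected default TDS version:", version)
    else:
        print("Default TDS version is 7.1")
```

On a real box the text would come from running tsql -C, for example through Python's subprocess module, instead of the hard-coded sample.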
I received a couple of comments from other people when I announced that I would try such a feat. It appeared that most people are running their own SQL-instances of various kinds for performance reasons. The Azure SQL -service is definitely not the fastest there is. But what if you're not in a hurry? The service is there, easily available, cheap and functional, even from Linux/PHP.
What programming languages to learn?
Monday, August 26. 2013
This is a classic question which I get to answer a lot. N00bs know the answer, but somebody outside the IT-business might ask something like that. This is also quite a popular question among young people trying to figure out if programming would be for them.
Anyway, here are 5 Programming Languages Everyone Should Know, from two people who have actually created some of the most popular languages currently in use.
Nobody should call themselves a professional if they knew only one language.
- Bjarne Stroustrup
Larry Wall
See his interview: http://youtu.be/LR8fQiskYII
His list:
- JavaScript
- Java
- Haskell
- C
- Perl
Perl is not a surprise in his list. He created it in the 80s. 
Bjarne Stroustrup
See his interview: http://youtu.be/NvWTnIoQZj4
His list:
- C++
- Java
- Python
- JavaScript
- C
- C#
Again, seeing C++ in his list is not a big surprise, as he created the language in the 80s. The funny thing is that he mentions 6 languages.
Linus Torvalds
This two-year-old interview keeps popping up. In this video http://youtu.be/Aa55RKWZxxI Mr. Linux mentions one programming language not to use. 
Then again, this person is well known for his more than colorful opinions about various issues. But anyway, his work on the Linux kernel and Git version management is well known, and he is a fan of C.
me
Being a blog-author I have to express an opinion of my own. To solely copy/paste opinions of three very skilled persons is too much of a cheap thing. So, here goes:
- C
- Pretty much all languages created after 1970 owe something to C, it is imperative to know this.
- JavaScript
- When doing any kind of web-stuff, this is the only language being used in 100% of the cases. All browsers run this and it is the de-facto client-side language today.
- C#
- A very versatile compiled language by Microsoft. It has a lot of influence from C, C++, Java, PHP, Perl, etc., and the list goes on. It is mainly used with .Net to create server-side stuff.
- PHP
- IMHO the most important web-server language there is. This is wildly popular and shares similarity with C, JavaScript, Perl, Visual Basic, etc.
In addition to learning programming languages, I encourage everybody to also learn the following widely popular frameworks:
- Microsoft .Net
- Zend Framework
My reasoning behind this is that if you understand how they work, you're pretty well covered, and going to Python/Django or Ruby on Rails is also a much easier task. I know that these are web-frameworks and people program a lot of other stuff than web, but sticking to the topic of what languages to learn, these are the first ones to try. There are so many other frameworks, especially in PHP-land, but they don't have such an essential position as the framework made by the people who created the PHP-language. In Microsoftland there are no other significant frameworks to learn. Anyway, both are properly documented and a lot of information can be found about them.
Exploring Dijit.Editor of Dojo toolkit
Sunday, August 25. 2013
My favorite JavaScript-library Dojo has a very nice HTML-editor. During a project I even enhanced it a bit. Anyway the Dojo/Dijit-documentation is not the best in the world, so I'll demonstrate the three operating modes that are available. All of them have the same functionality, but how they appear visually to the person doing the HTML-editing differs.
Classic fixed size
This is the vanilla operating mode. From the beginning of time, an HTML <TEXTAREA> has been like this (without any formatting, of course). It is a fixed block-container with a multi-line text editor, which will scroll on overflow.
Example:
HTML for declarative instantiation:
<div id="terms-Editor" data-dojo-type="dijit.Editor"
height="150px"
data-dojo-props="extraPlugins:['insertanchor','viewsource']">
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</div>
There is really nothing special, I'm just using two extra plugins insertanchor and viewsource to produce two new icons to the editor toolbar to add functionality for the end user. I found out that the plugins really need to be all lower-case for them to load properly. The class-names and filenames are in CamelCase, but putting them like that makes loading fail.
The obvious thing is that the editor is 150 pixels high. I didn't set the width, but since the editor is a simple div, any size can be set.
Auto-expanding
This is a plugin expansion for the previous. The only real difference is that this type of editor does not overflow. Ever. It keeps auto-expanding to any possible size to be able to display the entire text at once. During testing I found out that the auto-resize -code does not work on all browsers. There seems to be a discrepancy of exactly one line on for example Chrome. The bug manifests itself when you try to edit the last line, pretty much nothing is visible there. I didn't fix the bug, as I concluded that I won't be using this mode at all.
HTML for declarative instantiation:
<div id="terms-Editor" data-dojo-type="dijit.Editor"
height="" minHeight="100px"
data-dojo-props="extraPlugins:['alwaysshowtoolbar','insertanchor','viewsource']">
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</div>
There are three things to take note of:
- The auto-expansion is achieved by plugin called alwaysshowtoolbar. It does not work in current Dojo version of 1.9.1, I had to fix it. See the patch at the end of this post.
- It is absolutely critical to set the height="". Forget that and the alwaysshowtoolbar-plugin does not work.
- It is advisable to set a minimum height, in my example I'm using 100 pixels. The editor will be really slim, if there is no text. This sets the size into something visually appealing.
Manually resizable
This is how a <TEXTAREA> behaves on many modern browsers. When using plugin statusbar you'll get a handle to resize the block. During testing I found out that it is a bad idea to allow the user to be able to make the editor wider. I enhanced the class with additional parameter which gets passed to plugin's constructor to limit the ResizeHandle functionality.
Example:
HTML for declarative instantiation:
<div id="terms-Editor" data-dojo-type="dijit.Editor"
height="200px"
data-dojo-props="extraPlugins:[{name:'statusbar',resizeAxis:'y'},'insertanchor','viewsource']">
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</div>
Note that specifying resizeAxis won't work with stock Dojo, because it depends on my enhancement to the class. If you really want the code I can e-mail it to you; it is longish enough for me not to post it here. The height is 200 pixels initially, but the user can make the editor any size.
Hope this helps to clarify the complexity of this fine editor. It would be also nice if a filed bug report would be processed in Dojo-project. Also their discussion in Google Groups is closed, the group is set to moderate any posts, but there is nobody doing that. So, effectively new people cannot post. There is nothing left, but to complain in a blog. 
Appendix 1: AlwaysShowToolbar.js patch to fix the plugin-loading
 
--- dojo/dijit/_editor/plugins/AlwaysShowToolbar.js.orig 2013-07-19 13:21:17.000000000 +0300
+++ dojo/dijit/_editor/plugins/AlwaysShowToolbar.js 2013-08-02 17:31:44.384216198 +0300
@@ -13,7 +13,7 @@
// module:
// dijit/_editor/plugins/AlwaysShowToolbar
- return declare("dijit._editor.plugins.AlwaysShowToolbar", _Plugin, {
+ var AlwaysShowToolbar = declare("dijit._editor.plugins.AlwaysShowToolbar", _Plugin, {
// summary:
// This plugin is required for Editors in auto-expand mode.
// It handles the auto-expansion as the user adds/deletes text,
@@ -198,4 +198,11 @@
}
});
+ // Register this plugin.
+ // For back-compat accept "alwaysshowtoolbar" (all lowercase) too, remove in 2.0
+ _Plugin.registry["alwaysshowtoolbar"] = _Plugin.registry["alwaysshowtoolbar"] = function(args){
+ return new AlwaysShowToolbar();
+ };
+
+ return AlwaysShowToolbar;
});
How not to behave as a member of software project
Friday, August 2. 2013
This is about how NOT to behave. Writing about the opposite, how to behave, is way beyond me. I don't even know that myself. However, during my years in various companies and a multitude of software projects, I've met a lot of people. Some work more effectively as members of a software project than others (the best-of-the-best, if borrowing a quote from the movie Men in Black is allowed), and then there are the worst-of-the-worst. Today I share a story of one such project member.
There is this project where I've been working as a contractor for almost 5 years now. There were some changes in the organization and I thought now would be about the time for me to do something else for a while. I discussed this with management, gave my notice and they started looking for a new contractor, whom I promised to train before I leave.
Everything was fine until for my sins they gave me one (if borrowing a quote from the movie Apocalypse Now is allowed).
In the beginning it was pretty standard operations. My manager said, that there would be this new guy and he needed access to source code and ticketing. Pretty soon the new guy contacted me and asked for the credentials. I told him to hang tight so that I'd create a personal LDAP-account for him. I created the account, put in the needed groupings for a software developer -profile and handed them out to him. Nothing fancy there. Nothing that you wouldn't expect to see or do when arriving to a new assignment.
The next day he said something like "his code isn't that wonderful" to a colleague of mine. Naturally my colleague pretty soon told me what had happened. We've been working on the project together for a while, so it was a pretty normal reaction from him to tell me that the new guy was dissing my code. I confronted the new guy and said: "Come on! We're supposed to be working together, why would you dis my work there?" He surely knew how to set the initial impression. 
A couple of days passed by and my colleague came to me again: "Did you see that he posted his LDAP-account username and password to a public bulletin board?" There is so little to do in such a situation, except OMG!! and WTF!! Why would anybody do anything like that? This is yet again one thing way beyond me. Sharing your personal credentials with other people is grounds for termination of employment!
I don't know what's going to happen next. I informed my manager that I absolutely positively won't be working with this guy. He's obviously not qualified for this job and will most likely do more harm than good for the company. However, I've been getting a lot of visitors to my LinkedIn profile lately from his connections. Looks like I've made a lot of new fans for my Fan Club.  Perhaps the confidentiality agreement doesn't apply to all of us? It is generally a bad idea to blabber about company's internal things with your contacts.
To recap:
- Don't criticize your colleagues' work behind their backs.
- If you absolutely need to criticize somebody's work, do it to his/her face.
- Don't share passwords with anybody, they are meant to be kept secret.
- If you really know that a password can be shared, or you have permission to do so, then it's ok. When in doubt, don't do it.
- Don't blabber internal company or customers' issues to your friends.
- If you must do that, make sure you won't be caught doing so. When you get caught and get fired, remember I told you so.
- Take responsibility for everything you say and do. Really, that means about everything.
- This is much easier, when you say and do things people would expect somebody to say and do. If you go beyond the socially acceptable envelope, be prepared to take some heat for it.
- Then again, if your code looks like shit and works like shit, some people will call it shit. If you cannot quantify the results of your own work, then you're in shit. It is very unlikely that your work is the best there is. If you intentionally write code like shit and people call it shit, don't be surprised.
Sybase SQL and Microsoft SQL connectivity from Linux with FreeTDS library using IPv6
Monday, July 22. 2013
Microsoft SQL server is a fork of Sybase SQL server. This is because of their co-operation in the early stages, during the end of the 80s and the beginning of the 90s. For that reason the client protocol to access both servers is precisely the same, TDS. There is an excellent open-source library, FreeTDS, to access these SQL-servers from Linux. According to me and a number of other sources on The Net, this library can also access Windows Azure SQL server.
During my own projects I was building a Linux-image for Azure. My development boxes are spread around geographically, and in this case the simplest solution was to open access into a firewall to allow incoming IPv6 TCP/1433 requests.
My tests with this setup failed. IPv6-access was ok, firewall was ok, a socket would open without problems but my application could not reach my development SQL-box. Bit of a tcpdumping revealed that my Hyper-V hosted Linux-box attempted to reach my SQL-box via IPv4. What?! What?! What?!
A quick browse into FreeTDS-code revealed that it had zero IPv6-related lines of code. According to Porting IPv4 applications to IPv6, there should be usage of struct sockaddr_in6 and/or struct in6_addr. In the latest stable version of FreeTDS there is none.
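To illustrate what address-family agnostic name resolution looks like from the application side, here is a minimal sketch in Python. It is not FreeTDS code, and the function names are hypothetical. It only shows the idea from the porting guide: ask the resolver for any address family, then look at which entries are IPv6.

```python
import socket


def resolve_candidates(host, port):
    """Resolve host for TCP without forcing IPv4; returns getaddrinfo() tuples."""
    return socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)


def ipv6_only(candidates):
    """Keep only the IPv6 entries of a getaddrinfo() style result list."""
    return [entry for entry in candidates if entry[0] == socket.AF_INET6]


if __name__ == "__main__":
    # myownserver.here is the placeholder host name used in this post.
    try:
        found = ipv6_only(resolve_candidates("myownserver.here", 1433))
    except socket.gaierror as err:
        print("Resolving failed:", err)
    else:
        for family, socktype, proto, canonname, sockaddr in found:
            print("IPv6 candidate:", sockaddr[0])
```

A library that only ever uses the IPv4 structures will never see the IPv6 candidates, which matches what tcpdump showed me.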
After a lot of Googling I found a reference from the FreeTDS developers mailing list that in January 2013 Mr. Peter Deacon started working on IPv6-support. Naturally, this was good news to me. Another message on the ML from February 2013 said that the IPv6-support would be working nicely. Yet another good thing.
Now all I had to do was find the FreeTDS source code. I found somebody's Subversion copy of it, but Googling further was to no avail. The IPv6-patch was nowhere to be found, nor was the actual source code. The mailing list itself seems to be having some sort of technical difficulties. My attempts to ask for further information seemed to go nowhere. I had pretty much abandoned all hope when Mr. Frediano Ziglio was kind enough to inform me that the IPv6-support would be in the latest GIT-version of FreeTDS.
FreeTDS source code can be found from Gitorious at http://gitorious.org/freetds/freetds
I can confirm that the current Git-version does work with IPv6. However, for example PHP's PDO or Perl's DBI do not support entering IPv6-addresses into connect string. With FQDN I could confirm everything being IPv6 from Wireshark, but all my attempts of entering native IPv6-addresses into connect strings failed on both libraries and FreeTDS's CLI-tool tsql.
Anyway, here is what I did to test the thing. First I confirmed that there is basic connectivity:
tsql -H myownserver.here -p 1433 -U sa
Password:
locale is "en_US.UTF-8"
locale charset is "UTF-8"
using default charset "UTF-8"
1> sp_help MyCoolTable
2> go
1> quit
Then I took a simple example from Perl Monks site and modified it to work (the original code was quite crappy):
#!/usr/bin/perl -wT --
# vim: tabstop=4 shiftwidth=4 softtabstop=4 expandtab:
use DBI;
use Data::Dumper; # For debugging
use strict;
use utf8;
my $dsn = 'DBI:Sybase:server=myownserver.here;database=MyCoolDatabase';
my $dbh = DBI->connect($dsn, "sa", 'lemmein!') or
die "unable to connect to server. Error: $DBI::errstr";
my $query = "SELECT * FROM MyCoolTable";
my $sth = $dbh->prepare($query) or
die "prepare failed. Error: $DBI::errstr";
$sth->execute() or
die "unable to execute query $query. Error: $DBI::errstr";
my $rows = 0;
while (my @first = $sth->fetchrow_array) {
    ++$rows;
    print "Row: $rows\n";
    foreach my $field (@first) {
        print "field: $field\n";
    }
}
print "$rows rows returned by query\n";
I also did some more complex testing with PHP PDO and had no issues. I even made sure from my firewall settings that I could not accidentally access the SQL Server via IPv4. It just works perfectly! 
 If you need my src.rpm or pre-compiled packages, just drop a comment.
Parallels Plesk Panel 11 RPC API - reading DNS records
Tuesday, July 9. 2013
Getting Parallels Plesk Panel to do something without admin's interaction is not tricky. My favorite method of remote-controlling Plesk is via its RPC API. I am a co-author of the Perl-implementation API::Plesk, which is available on CPAN.
All XML RPC -requests should be directed towards your Plesk-server at URL
 https://-your-plesk-box-here-:8443/enterprise/control/agent.php 
Raw XML
First we'll need to get the internal site ID of a domain. A request to get all the subscriptions looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<packet version="1.6.3.5">
<webspace>
<get>
<filter/>
<dataset>
<gen_info/>
</dataset>
</get>
</webspace>
</packet>
Note: It would have been possible to filter a specific subscription by domain name, but in this case we just wanted a list of all. 
A response to it will contain domain names and their IDs:
<?xml version="1.0" encoding="UTF-8"?>
<packet version="1.6.3.5">
<webspace>
<get>
<result>
<status>ok</status>
<filter-id>1</filter-id>
<id>1</id>
<data>
<gen_info>
<name>www.testdomain.org</name>
</gen_info>
</data>
</result>
</get>
</webspace>
</packet>
The response packet contains internal ID and name. We'll be using the internal ID of 1 to get all the DNS-records of the zone:
<?xml version="1.0" encoding="UTF-8"?>
<packet version="1.6.3.5">
<dns>
<get_rec>
<filter>
<site-id>1</site-id>
</filter>
</get_rec>
</dns>
</packet>
A response packet will look like this:
<?xml version="1.0" encoding="UTF-8"?>
<packet version="1.6.3.5">
<dns>
<get_rec>
<result>
<status>ok</status>
<id>111</id>
<data>
<site-id>1</site-id>
<type>CNAME</type>
<host>www.testdomain.org.</host>
<value>testdomain.org.</value>
<opt/>
</data>
</result>
</get_rec>
</dns>
</packet>
There seems not to be any other way of picking a specific record. A filter with type/name would be welcome. Any further operations would be done with the domain record's ID. In this case it is 111.
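Since the API does not offer a type/name filter, the filtering has to be done on the client side after fetching all the records. Here is a minimal sketch in Python using only the standard library and the response packet shown above as input. The function name find_records is hypothetical.

```python
import xml.etree.ElementTree as ET

# The DNS response packet from above, used as sample input.
RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<packet version="1.6.3.5">
<dns>
<get_rec>
<result>
<status>ok</status>
<id>111</id>
<data>
<site-id>1</site-id>
<type>CNAME</type>
<host>www.testdomain.org.</host>
<value>testdomain.org.</value>
<opt/>
</data>
</result>
</get_rec>
</dns>
</packet>
"""


def find_records(xml_bytes, rec_type=None, host=None):
    """Return DNS records from a get_rec response, optionally filtered by type and host."""
    root = ET.fromstring(xml_bytes)
    matches = []
    for result in root.iterfind("./dns/get_rec/result"):
        data = result.find("data")
        if result.findtext("status") != "ok" or data is None:
            continue  # skip failed or empty results
        record = {
            "id": int(result.findtext("id")),
            "type": data.findtext("type"),
            "host": data.findtext("host"),
            "value": data.findtext("value"),
        }
        if rec_type is not None and record["type"] != rec_type:
            continue
        if host is not None and record["host"] != host:
            continue
        matches.append(record)
    return matches


if __name__ == "__main__":
    for rec in find_records(RESPONSE, rec_type="CNAME", host="www.testdomain.org."):
        print(rec["id"], rec["type"], rec["host"], "->", rec["value"])
```

The id of a matching record is then what any further operation on that record would use.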
Perl-code
With a software library, the access is much easier. The same requests would be something like this in Perl:
use API::Plesk;
use Data::Dumper; # For printing out the results

my $plesk_client = API::Plesk->new('api_version' => '1.6.3.5',
    'secret_key' => $plesk_api_key,
    'url' => 'https://-your-plesk-box-here-:8443/enterprise/control/agent.php',
    'debug' => 0);
# Get all the subscriptions
my $res = $plesk_client->webspace->get();
die "Subscriptions->get() failed!\n" . $res->error . "\n" if (!$res->is_success);
my @domains = @{$res->results()};
foreach my $domain (@domains) {
    my $domainId = $domain->{"id"} + 0; # toInt
    # Get the DNS-records of this site
    my $res = $plesk_client->dns->get('site-id' => $domainId);
    die "DNS->get() failed!\n" . $res->error . "\n" if (!$res->is_success);
    # Only the first record is dumped here
    my %dns = %{ $res->results()->[0] };
    print Dumper(\%dns);
}
That is pretty much it.
Update (2nd Nov 2013)
 
To get all of the domains will require a two-step process (order does not matter): 1) get all the subscriptions (kind of main domains) and 2) get the other domains under subscriptions.
In my Perl-code I do it like this:
# NOTE: This is from the above code
# 1st round:
# Get all the subscriptions.
# There we have the "main" domains
$res = $plesk_client->webspace->get();
die "Subscriptions->get() failed!\n" . $res->error . "\n" if (!$res->is_success);
# NOTE: New one:
# 2nd round:
# Get all the sites.
# There we have the "non-main" domains
$res = $plesk_client->site->get();
die "Sites->get() failed!\n" . $res->error . "\n" if (!$res->is_success);
@domains = @{$res->results()};
In my case, the $res-hash is fed into an ExtractDomains()-function to get the details I need from them. If only the name is required, then no further processing is necessary.
Dojo 1.8 / 1.9 on Zend Framework 1
Wednesday, May 15. 2013
I'm a big Dojo fan. Its loading system makes it really fast on front-end. Also Dojo integrates well with Zend Framework.
ZF 1 is being phased out, but I haven't found the time to migrate to version 2 yet. Meanwhile Dojo / Dijit / Dojox will get updates, but those are not being carried over into ZF 1.
Here is my Zend Framework 1 patch to make Dijit components AMD-loading compatible. It makes Zend Framework Dijit-modules to use the slash-notation in paths. Especially in Dojo 1.9 using dots will yield errors like:
mixin #1 is not a callable constructor.
or
base class is not a callable constructor.
The errors vary depending on what you're calling. Pretty much all of your JavaScript ceases to execute. The problem comes from the fact that Dijit does not function exactly the same way it used to before 1.9.
Failing example:
<div data-dojo-type="dijit.MenuSeparator"></div>
Working example:
<div data-dojo-type="dijit/MenuSeparator"></div>
 The difference is minimal, but makes everything tick again.
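The notation change itself is mechanical. As an illustration only (this is not my Zend Framework patch, and the function name is hypothetical), here is a minimal Python sketch that rewrites dotted module names in data-dojo-type attributes of already rendered HTML into the slash-notation:

```python
import re

# Matches data-dojo-type="..." and captures the module name in group 2.
_DOJO_TYPE = re.compile(r'(data-dojo-type=")([^"]+)(")')


def to_amd_notation(html):
    """Replace dots with slashes inside data-dojo-type attribute values."""
    return _DOJO_TYPE.sub(
        lambda m: m.group(1) + m.group(2).replace(".", "/") + m.group(3),
        html,
    )


if __name__ == "__main__":
    print(to_amd_notation('<div data-dojo-type="dijit.MenuSeparator"></div>'))
```

Only the attribute value is touched, so dots elsewhere in the markup, for example in text or URLs, are left alone.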
Dojo custom build
Monday, April 22. 2013
The Dojo JavaScript framework has a nice system for packaging the library for your own app. During packaging you may minify the library, reduce the number of files being loaded and leave unnecessary parts out of it. However, ever since Dojo 1.7 the build system has been pretty complex and the documentation almost non-existent. There is zero beginner documentation; the existing documentation is aimed at those who already know their way around.
The prerequisites for doing a Dojo build are Node.js and a Java runtime. The rumour is that the build would work with either one of those, but I most definitely cannot confirm that. My production and development boxes have CentOS 6.4, so initially I had neither of those installed. To comply with the requirements, I installed my own build of Node.js 0.10.4 and for Java OpenJDK 1.7.0 (the package is called java-1.7.0-openjdk in CentOS).
My CentOS 6 RPMs of Node.js are available at http://opensource.hqcodeshop.com/Node.js/ if you need them.
Then to the Dojo-build. There are the IMHO crappy docs at http://dojotoolkit.org/reference-guide/1.8/build/. Most of the stuff I needed to figure out, I had to Google or dig out of the source. When you unpack the source-package you'll end up having a util/buildscripts/profiles/ directory, which does not exist in the release (minified) package.
A build profile is kind of a makefile. It instructs the build what to package and how. To my great surprise they changed the profile style and you'll find two different styles:
- Old style:
- dependencies = { /* A JavaScript object definition here */ }
- New style:
- var profile = (function(){ /* A JavaScript object definition here */ });
A standard Dojo release build is done with profile named standard (no surprises there, huh?). The command for doing that would be, for example:
./util/buildscripts/build.sh profile=standard version=1.3.2-dev \
releaseName=dojo-release-1.3.2-dev cssOptimize=comments.keepLines \
optimize=shrinksafe.keepLines cssImportIgnore=../dijit.css action=release
I tried to emulate that with a new-style profile file of my own. The profile-file has most of the command-line parameters in it, so running it will be much simpler. Copy the profile into the profiles-directory and something like this will do:
./util/buildscripts/build.sh profile=Dojo-JaTu cssOptimize=comments.keepLines \
cssImportIgnore=../dijit.css action=release
There are a number of choices you can make with the profile. For example, you may choose not to minimize the build by changing the following:
mini: false,
optimize: false,
layerOptimize: false,
This produces a built, but debuggable file which is much nearer to the release than the source-package. You see, the build will replace a number of options with structures like
if (1) { /* then something */ }, which initially look strange, but in reality just reflect the hard-coded choices you made during the build. The release version will have those anyway, no matter which release version you use. Doing your own custom build, you'll have control over which parts of the code are in and which are out.
I still haven't grasped the "layer"-concept fully. A layer is a single file containing a number of Dojo-modules. Anyway, that definitely is something worth studying. It will yield much faster loading web pages.
Trying to wrangle Dojo and struggling with its build system took me a nice working week. That was time well spent. Now I can make my own tailored Dojo-packages for a production site which loads really fast.
AbyssGuard 1.7.7 PHP source code de-obfuscated
Thursday, April 11. 2013
"Project Honey Pot is the first and only distributed system for identifying spammers and the spambots they use to scrape addresses from your website." (Direct quote from their website.)
Utilizing the results from the project is pretty straightforward, just get an existing library and start using it to check incoming IP-addresses. One of the PHP-libraries is AbyssGuard. It is distributed under GPLv3 for personal use. Being an open-source fan I naturally like to exercise my GPLv3-given right to modify and distribute modified copies of the original work.
However, in this case the author chose not to distribute the source code. WTF?! It appears that the only format he chose to distribute the project is in obfuscated code. I don't much care about PHP's eval()-function and like to configure my PHP with disable_functions=eval whenever possible, so this piece won't even run on my box.
I did the only reasonable thing an open-source loving PHP-coder would do. I de-obfuscated the code and distributed it on my site with appropriate GPLv3-required notification about it. So feel free, it is at http://opensource.hqcodeshop.com/AbyssGuard/ for you to get it.
MySQL 5.6 subquery ORDER BY behaviour - fixed
Thursday, February 28. 2013
Gillian from Oracle informed me that my query is not valid SQL and the 5.5 version worked just because I was lucky.
The correct way of using aggregate function count() is something like this:
SELECT mlh.changedate, mlh_latest.counts, mlh.level
FROM memberlevelhistory mlh
INNER JOIN (
SELECT member, MAX(changedate) as maxdate, COUNT(changedate) as counts
FROM memberlevelhistory
WHERE member = 5
AND approved <> 'N'
) AS mlh_latest ON mlh.member = mlh_latest.member AND mlh.changedate = mlh_latest.maxdate
WHERE mlh.member = 5
AND mlh.approved <> 'N';
Now the result is equally correct on both tested versions. 
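The corrected query is easy to try out without a MySQL server. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption: it says nothing about the 5.5 / 5.6 difference itself. The table and rows are the ones from my original post, and the GROUP BY member in the subquery is my addition to keep the aggregate query portable.

```python
import sqlite3

# The same rows as in the original post.
ROWS = [
    (5, "2009-08-01", 1, "Y"), (5, "2009-08-27", 2, "Y"),
    (5, "2009-10-01", 4, "Y"), (5, "2010-01-01", 5, "Y"),
    (5, "2010-02-01", 8, "Y"), (5, "2010-03-15", 9, "Y"),
    (5, "2011-02-01", 11, "Y"), (5, "2011-05-01", 12, "Y"),
    (5, "2012-02-01", 13, "Y"), (5, "2012-03-01", 14, "Y"),
    (5, "2012-04-01", 15, "Y"),
]

QUERY = """
SELECT mlh.changedate, mlh_latest.counts, mlh.level
FROM memberlevelhistory mlh
INNER JOIN (
    SELECT member, MAX(changedate) AS maxdate, COUNT(changedate) AS counts
    FROM memberlevelhistory
    WHERE member = 5
      AND approved <> 'N'
    GROUP BY member
) AS mlh_latest ON mlh.member = mlh_latest.member AND mlh.changedate = mlh_latest.maxdate
WHERE mlh.member = 5
  AND mlh.approved <> 'N'
"""


def current_level():
    """Create the table in memory, load the rows and run the corrected query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memberlevelhistory ("
        " member INTEGER NOT NULL, changedate TEXT NOT NULL,"
        " level INTEGER NOT NULL, approved TEXT NOT NULL,"
        " PRIMARY KEY (changedate, member))"
    )
    conn.executemany("INSERT INTO memberlevelhistory VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(current_level())  # [('2012-04-01', 11, 15)]
```

The result no longer depends on the row order of the subquery, because the latest date is picked with MAX() instead of ORDER BY.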
MySQL 5.6 subquery ORDER BY behaviour changed from 5.5
Wednesday, February 27. 2013
MySQL 5.6.10 handles an INNER JOIN / subquery -pair differently than 5.5.29. I found this out by accident when working code ceased to return proper results.
Example setup, a very simple table and a couple of rows:
CREATE TABLE `memberlevelhistory` (
`member` tinyint(3) unsigned NOT NULL,
`changedate` date NOT NULL,
`level` int(10) unsigned NOT NULL,
`approved` char(1) NOT NULL,
PRIMARY KEY (`changedate`,`member`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `memberlevelhistory`
(`member`, `changedate`, `level`, `approved`)
VALUES
(5, '2009-08-01', 1, 'Y'),
(5, '2009-08-27', 2, 'Y'),
(5, '2009-10-01', 4, 'Y'),
(5, '2010-01-01', 5, 'Y'),
(5, '2010-02-01', 8, 'Y'),
(5, '2010-03-15', 9, 'Y'),
(5, '2011-02-01', 11, 'Y'),
(5, '2011-05-01', 12, 'Y'),
(5, '2012-02-01', 13, 'Y'),
(5, '2012-03-01', 14, 'Y'),
(5, '2012-04-01', 15, 'Y');
Description of columns:
- member: user ID
- changedate: when member level was changed
- level: user level
- approved: level change approved by administration 
The idea of the table is that the most recent approved level is the user's current level.
Example query to get user's current approved level with total number of approved user levels:
SELECT mlh.changedate, count(*), mlh.level
FROM `memberlevelhistory` mlh
INNER JOIN (
SELECT member, changedate, level
FROM `memberlevelhistory`
WHERE member = 5
AND approved <> 'N'
ORDER BY `changedate` DESC
) AS `mlh2` ON mlh.member = mlh2.member AND mlh.changedate = mlh2.changedate
WHERE mlh.member = 5
AND mlh.approved <> 'N'
MySQL 5.5 result, current level as expected:
+------------+----------+-------+
| changedate | count(*) | level |
+------------+----------+-------+
| 2012-04-01 | 11 | 15 |
+------------+----------+-------+
1 row in set (0.00 sec)
MySQL 5.6 result, a surprise here:
+------------+----------+-------+
| changedate | count(*) | level |
+------------+----------+-------+
| 2009-08-01 | 11 | 1 |
+------------+----------+-------+
1 row in set (0.00 sec)
The query behaviour has changed. The subquery ORDER BY clause has no effect. I did solve the problem of the latest level with LIMIT 1 in the subquery, but that ruins the COUNT(*). I'm still working to replicate the 5.5 result in a single query. If a solution can be found, I'll blog about it.
SQLite extension-functions RPM-packaged
Monday, February 25. 2013
SQLite has very little support for typical arithmetic functions. In the SQLite contrib-section there is an extension for that by Liam Healy. The description goes:
Provide mathematical and string extension functions for SQL queries using the loadable extensions mechanism.
- Math: acos, asin, atan, atn2, atan2, acosh, asinh, atanh, difference, degrees, radians, cos, sin, tan, cot, cosh, sinh, tanh, coth, exp, log, log10, power, sign, sqrt, square, ceil, floor, pi
- String: replicate, charindex, leftstr, rightstr, ltrim, rtrim, trim, replace, reverse, proper, padl, padr, padc, strfilter
- Aggregate: stdev, variance, mode, median, lower_quartile, upper_quartile
To ease the installation, I packaged it into an RPM for CentOS 6:
The source-RPM will build quite easily on any RPM-distro. There are no weird dependencies or anything. 
Example usage:
# sqlite3
SQLite version 3.6.20
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .load libsqlitefunctions.so
sqlite> select floor(1.9);
1
sqlite> select ceil(2.1);
3
sqlite> select reverse("reverse");
esrever
sqlite> .quit
Now developing apps with SQLite back-end is much easier.
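When loading a shared library is not an option, a few of these functions can be approximated from application code. The sketch below is a different technique from the loadable extension and contains none of Liam Healy's code: it assumes a Python application and registers plain Python callables through the standard sqlite3 module's create_function(). The function names simply mirror the ones used in the session above.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

# Register Python callables as one-argument SQL functions.
# These are stand-ins written for this sketch, not the extension itself.
conn.create_function("floor", 1, math.floor)
conn.create_function("ceil", 1, math.ceil)
conn.create_function("reverse", 1, lambda text: text[::-1])

row = conn.execute(
    "SELECT floor(1.9), ceil(2.1), reverse('reverse')"
).fetchone()
print(row)
```

Functions registered this way live only on the connection that created them, so every new connection has to register them again. The compiled extension is the better choice whenever the sqlite3 command-line tool or other languages need the same functions.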
MIT 2013 Mystery Hunt - A Regular Crossword
Friday, February 8. 2013
The puzzle at Mystery Hunt site. A PDF-version of it, to solve it with paper and pencil.
I wrote the regexp-puzzle into an HTML / JavaScript page, go see it at http://opensource.hqcodeshop.com/grid/. It displays a regexp in red if it does not match.
This one is really tough to solve. Good luck!
Back-ported hash_pbkdf2() from PHP 5.5
Tuesday, January 22. 2013
PHP has been lacking a properly implemented password-hash function. Many web-sites really would benefit from having such a thing available. The Zend Framework guys implemented one in their ZF2. Nice! But for us not running ZF2, doing 1000 hashes in a loop with PHP-code does not sound like a good idea.
Initially I thought that Mcrypt-project would implement PBKDF2 and PHP would gain the function that way. Apparently they're not interested either.
The good news comes from the PHP-project. They implemented hash_pbkdf2() into native PHP. Great! The problem is that PHP 5.5.0 has not been released yet. I didn't want to wait and back-ported the function from the PHP 5.5.0 source tree into my own 5.4.11.
For those wanting to build their own, the patch is here: php-5.4.11-pbkdf2.patch
The test from PHP manual page:
$hash = hash_pbkdf2("sha256", $password, $salt, 1, 20);
echo $hash . "\n";
Yields exactly correct result: 120fb6cffcf8b32c43e7
Doing only 1 round is very naive. The recommended minimum is 1000 and apparently 2000 is the way to go. I took Zend Framework's Zend\Crypt\Key\Derivation\Pbkdf2 as a reference and did 2000 rounds instead of 1. Both algorithms return exactly the same result, though they handle the length parameter differently: ZF2 counts bytes, while PHP's native version counts hex-string characters. I did iron out the difference in my code.
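The bytes versus hex-characters difference is easy to check outside PHP. This is a minimal sketch, assuming Python's standard hashlib and hmac modules as a stand-in for both PHP implementations. The helper pbkdf2_first_block is a hypothetical name written for this example; it spells out the hashing loop for a single output block, which is enough for any key of up to 32 bytes with SHA-256.

```python
import hashlib
import hmac


def pbkdf2_first_block(password: bytes, salt: bytes, rounds: int) -> bytes:
    """First PBKDF2-HMAC-SHA256 block (32 bytes), using an explicit loop."""
    # U1 = HMAC(password, salt || 0x00000001), then U_n = HMAC(password, U_{n-1})
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    block = int.from_bytes(u, "big")
    for _ in range(rounds - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        block ^= int.from_bytes(u, "big")  # XOR all U values together
    return block.to_bytes(32, "big")


# hashlib counts the length in bytes: 10 bytes become 20 hex characters.
raw = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 1, dklen=10)
hex_digest = raw.hex()
print(hex_digest)

# The explicit loop and the C implementation agree for 2000 rounds as well.
looped = pbkdf2_first_block(b"password", b"salt", 2000)[:10]
native = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 2000, dklen=10)
print(looped == native)
```

With one round this reproduces the 20-character value quoted above, which shows that a length of 10 in a byte-counting API corresponds to a length of 20 in PHP's hash_pbkdf2().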
The native version does 2000 rounds in 0.00674 seconds, and the PHP-code version from ZF2 does that in 0.012470 seconds, so the C-compiled binary is almost twice as fast.
My test code for native version:
<?php
$password = "password";
$salt = "salt";
$now = microtime(true);
$hash = hash_pbkdf2("sha256", $password, $salt, 2000, 20);
$dura = microtime(true) - $now;
echo $hash . "\n";
echo sprintf("%12.11F", $dura) . " seconds\n";
?>
My test code for Zend Framework 2 version:
<?php
require_once 'Hmac.php';
require_once 'Pbkdf2.php';
$password = "password";
$salt = "salt";
$now = microtime(true);
$hash = Zend\Crypt\Key\Derivation\Pbkdf2::calc("sha256", $password, $salt, 2000, 10);
$dura = microtime(true) - $now;
echo bin2hex($hash) . "\n";
echo sprintf("%12.11F", $dura) . " seconds\n";
?>
If your site is not storing passwords properly, it's about time to start now.



