Initialise External Javascript in Page Fragments

Over the last couple of hours I have been thinking about maintainable, performant and unobtrusive JavaScript for a project I am working on.

The situation:
- A lot of inline JavaScript
- Many JS and CSS files referenced
- No front-end deployment/build process in place
- Some JavaScript values are set at page generation time by server-side code
- The web server is not yet configured to send expires and compression headers

Problems with this:
- Maintainability and separation of concerns suffer with lots of inline JS
- Long page load times because of the many HTTP requests
- The web server takes a big hit for serving all the static content
- The network is stressed as well
- The JavaScript logic is unobfuscated, not minified and visible to the world

Tasks:

  • Externalise JS
  • Modularise JS dependency loading vs. packing everything into one file
  • Configure HTTP headers in the web server
  • Identify scripts / static content for a CDN
  • Seamless deployment/build process from comfortable JS development to production code

The “challenge” was to externalise the inline Javascript.

For modularisation there are several tools that handle combining JavaScript files according to their dependencies. Depending on the amount of JavaScript and the usage metrics you can just as well pack everything into one file. There are many other methods for combination and modularisation, and you can also roll your own. So this is basically a matter of choosing the right tool or method and not that difficult. For compression / minification the YUI Compressor seems to be state of the art.
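To make the dependency part a bit more concrete, here is a minimal sketch of what such a tool has to work out before it concatenates files: an order in which every file comes after the files it depends on. The file names and the `deps` map are made up for illustration, and `loadOrder` is a hypothetical helper, not the API of any particular tool.

```javascript
// Hypothetical dependency map: file -> files that must be loaded first.
var deps = {
    "app.js": ["widgets.js", "util.js"],
    "widgets.js": ["util.js"],
    "util.js": []
};

// Depth-first walk that emits each file after its dependencies.
// Assumes the dependency graph has no cycles.
function loadOrder(deps) {
    var order = [];
    var seen = {};
    function visit(file) {
        if (seen[file]) { return; }
        seen[file] = true;
        (deps[file] || []).forEach(function (dep) { visit(dep); });
        order.push(file);
    }
    Object.keys(deps).forEach(function (file) { visit(file); });
    return order;
}

var order = loadOrder(deps);
```

Concatenating the files in this order gives one bundle in which every dependency is defined before it is used.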

Configuring the correct headers is also just a configuration task: setting meaningful cache control and expires headers and enabling deflate compression in Apache2, so that documents are compressed over the network. As for the deployment and build process, Sprockets, Ant or others are viable solutions.

What I would like to share here is the solution for externalising JavaScript that is initialised by server-side code.

The basic idea builds upon this post by Dustin Diaz. I am not going to repeat his content; if you are not familiar with private, public and privileged members in JavaScript you might want to read his post first. As a starting point, when combining multiple inline JavaScript snippets we have to take care of namespaces.

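//just namespacing

function registerNS(ns)
{
 var nsParts = ns.split(".");
 var root = window;

 // Assumed loop body (the listing is truncated at this point):
 // walk the parts and create each missing level of the namespace.
 for(var i=0; i<nsParts.length; i++)
 {
  if(typeof root[nsParts[i]] == "undefined")
   root[nsParts[i]] = {};
  root = root[nsParts[i]];
 }
}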

OK, so far so good. Now we need a way to initialise those external JavaScript files with dynamic server-side variables:

//initialise a namespaced object

var MY = function() {
	var private_var;
	function private_method() {
		// do stuff here
	}
	return {
		init : function(var1) {
			private_var=var1;
		},
		method_1 : function() {
			alert("method 1");
		},
		method_2 : function() {
			alert(private_var);
		}
	};
}();

MY.init("hey you!");
MY.method_1();
MY.method_2();

That worked out as well. But there’s a but. Our JavaScript will be referenced externally and, in line with the YUI performance rules, the script include tags will be at the bottom of the page. The initialisation code, on the other hand, sits at the top of every page fragment/widget and might be executed at rendering time. To prevent timing problems, the initialisation code therefore needs to configure an object that is already live:

//global js config obj
var runcfg = {};

//within html
var a = runcfg["details"] = {};
a["url"]="http://uebersoftware.com";

//external js
var MY = function() {
	var private_var;
	// read once when this file loads, so the inline code above must have run already
	var myconfig = runcfg["details"];
	function private_method() {
		// do stuff here
	}
	return {
		init : function(var1) {
			private_var=var1;
		},
		method_1 : function() {
			alert(myconfig["url"]);
		}
	};
}();

// usage
MY.method_1();

This method allows us to externalise the JavaScript easily and without much refactoring. The script code is namespaced and still dynamically configurable.
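One variation I would consider is to look the configuration up when a method is called rather than when the external file loads. The following is only a sketch under that assumption: `getConfig` and `getUrl` are hypothetical names, and the code returns the value instead of calling `alert` so it can run outside a browser.

```javascript
// Global config registry; created once, before any fragment is rendered.
var runcfg = {};

// Inline code emitted by the server inside a page fragment.
// The key "details" and the URL are the values from the example above.
runcfg.details = { url: "http://uebersoftware.com" };

// External file, loaded at the bottom of the page.
var MY = (function () {
    // Hypothetical helper: read the config at call time, not at load time,
    // so it does not matter whether the fragment ran before this file.
    function getConfig() {
        return runcfg.details || {};
    }
    return {
        getUrl: function () {
            return getConfig().url;
        }
    };
})();
```

With the lookup deferred like this, a fragment that registers or changes its configuration later is still picked up by the next call to `MY.getUrl()`.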

Posted in javascript, Uncategorized

JPA Best Practices and efficient research

Recently, while evaluating a switch of persistence provider for a project, I was doing some JPA research and came across this exhaustive presentation from Carol McDonald: JPA Best Practices.

I have been working with JPA for quite some time now, but always found the documentation not very detailed for such an important subject. It is good to find some coherent information, and surprising to me that this presentation has only 1851 views on Slideshare. Slideshare is actually a really good resource for quality in-depth information on various technical subjects.

Think about it: it’s most likely a presentation that was actually given in front of an audience. The presenter has most likely done research on the topic, is experienced in the subject or has at least tried out what he is presenting. Hence, as a rule of thumb, I assume a higher probability of quality, researched information than you find in random places on the Internet.

It’s a pity that those presentations often do not rank that well in Google searches.

Another resource that has become good over the last couple of months is stackoverflow.com. They got a few things right to encourage and control good answers, and good questions. SO would never have had this success without the social status / name branding aspect built into it. In contrast to Slideshare, Google is up to date with its indexing here.




Posted in JEE, Search

Firefox 3.5 hanging? slow? buggy?

Like many other users I recently had slow-down problems with the new Firefox 3.5.x. It seems the Skype plugin was causing the issues; disabling it worked for me. There is an interesting thread on the issue. Skype has apparently had problems producing plugins of reasonable quality for quite some time. Mozilla was aware of it but, for political reasons, did not block the known troublesome plugin in the new version.

It feels really strange for such a company to piss off users just to please some execs at Skype / eBay.

Posted in Distractions

Long development cycles of compiled languages – the real killer

Ever asked yourself why scripting languages like PHP, Ruby and Groovy have risen so much in web development over the last couple of years? Let’s take the concrete case of Java. Common critiques of the language, its APIs and the J2EE/JEE frameworks were/are, in loose historical order:

  • performance – too slow: this was quite a while ago. JVMs are very fast nowadays
  • bloated: still true for some aspects, e.g. no closures (still..)
  • too complex & heavyweight, meaning too many dependencies to set up and take care of. This was especially the case for J2EE; JEE changed a lot here.

The list could go on, but it mostly centres on those arguments, which mostly refer to older releases of the language and its libraries. Nowadays JEE is actually very lightweight, has a great set of tools that speed up app development and is lightning fast. With advances in tooling support, common IDEs make it irrelevant whether you have to generate setters and getters (bloated?) or not, and enable refactoring of code with a button click.

When you compare (web) framework support between Java and interpreted scripting languages, Java has learned a lot from the past mistakes of J2EE 1.2-1.4. CoC (Convention over Configuration), DRY (Don’t Repeat Yourself), scaffolding etc. can be found in Java frameworks just as in RoR. I don’t think there is a general winner; it really depends on the task and situation at hand.

Then there’s dynamic vs. static typing. There are pros and cons for both; I prefer static typing to detect errors early, at development time.

I claim that the real reason for the continued success of scripting languages is the faster development cycle. In particular there is no compilation, undeployment and redeployment needed. This speeds up development considerably and enables more trial-and-error coding if you feel like it, though you had better do white-box testing.

Apart from that I cannot see any real advantage of interpreted languages today, especially compared to Java. The JVM is one of the most advanced pieces of technology we have today: it is robust, it scales, it is well understood and it has many mature implementations on all platforms.

What is your preferred language and why?

Posted in java, programming languages, Web2.0

Scaling Enterprise Applications and all that jazz: Terracotta, GigaSpaces, and Azul

I like scaling and the architectures that attempt to solve those issues. Below I have tried to summarise, in bullet points, 3 prominent players in this area, all solving scaling problems with different architectures at different levels.

  • Terracotta 3.1
    • Clustering JVM using Network attached Memory
    • Only the field-level changes are sent over the network
    • Uses TCP to communicate within cluster
    • Open Source and recently acquired EHCache
  • GigaSpaces
    • Cloud enabled Middleware Platform (PaaS)
    • “Space Based Architecture” – inspired by JavaSpaces
    • Partitioning & co-location as essence. Ultimate goal: “share nothing architecture” – eliminate the costs of copying
  • Azul Systems
    • Proxy JVM with transparent redeploy to Azul Hardware
    • Integrated hardware, kernel and JVM Design
    • Build their own Multicore System running their own Chips
    • Systems have a high number of cores
    • Optimistic Thread Concurrency & Pauseless Garbage Collection Technology

Terracotta @ JavaOne
http://www.dailymotion.com/videox9jpam

Gigaspaces Highlevel:


Azul – Very technical Google TechTalk


Many new developments also in the cloud space. What is your favorite scaling technology?

Posted in Scalability, Terracotta

Discovered Quercus

Scalability, security, pooling, long-lived connections and container services in general are all aspects where Java has a lot to offer. PHP, on the other hand, is good for fast development cycles with a low-maintenance infrastructure; it is request/response oriented, with not much support for long-lived connections, caching etc.

On one of my favorite sites on scaling you can find all the tricks popular SNS and web 2.0 sites used to overcome the long-term drawbacks of such a scripting language, which had served them well at the start for a quick front-end result. Why drawbacks? I agree that scripting languages are a trend, think of Groovy or RoR. However, for a solid backend with good session data handling, distributed caching, management tools, security etc. I would still prefer a compiled, statically typed, sandboxed language with security features and all the APIs and services that Java has to offer, or maybe C#.

So thinking of today’s social networking applications, most of which require a lot of session data, a combined solution might be a good one: scripting in the front end and the JVM/JEE in the backend. The backend might be a messaging layer, a business logic layer with a domain-oriented architecture, or something else.

And here it comes: the guys from Caucho, makers of Resin (an app server I actually used once for a university project a long, long time ago), have implemented the complete PHP language in Java! They call it Quercus, and at the time of this post it is at version 3.1. Quercus is deployed as a standard webapp and can run on any JEE container. A Drupal installation apparently benefited from a 4x improvement over standard mod_php. For a fair performance comparison, however, it has to be compared with op-code caches (PHP code compiled and cached; I heard good things about XCache), which typically reduce server load and increase the speed of PHP code anywhere from 2 to 10 times. Also, lighttpd may be faster than Apache.

Performance aside, what matters more to me is testing whether I can integrate my PHP app with my JEE apps and share a persistence layer / ORM cache as well as clustering aspects with e.g. Terracotta. I will definitely follow up on this with some results of this technology “mash-up”.

Posted in java, php, Scalability

Copyright reloaded

I found this article recently. Quite funny: the designer of the copyright symbol sues e.g. the RIAA for using his artwork, whereas the RIAA sees itself as something like the holy grail of copyright.

Posted in Distractions

FB Series: AJAX & Facebook debugging

The PHP Facebook client has a nice setting to track the different HTTP calls between your app and Facebook. You can enable it with:

$GLOBALS['facebook_config']['debug'] = true;

This will actually insert some script code into the header of every page served, in order to render the tracking output if there is something to render:

<script type="text/javascript">
var types = ['params', 'xml', 'php', 'sxml'];
function getStyle(elem, style) {
  if (elem.getStyle) {
    return elem.getStyle(style);
  } else {
    return elem.style[style];
  }
}
function setStyle(elem, style, value) {
  if (elem.setStyle) {
    elem.setStyle(style, value);
  } else {
    elem.style[style] = value;
  }
}
function toggleDisplay(id, type) {
  for (var i = 0; i < types.length; i++) {
    var t = types[i];
    var pre = document.getElementById(t + id);
    if (pre) {
      if (t != type || getStyle(pre, 'display') == 'block') {
        setStyle(pre, 'display', 'none');
      } else {
        setStyle(pre, 'display', 'block');
      }
    }
  }
  return false;
}
</script>

The catch is that the script code is always inserted, even if no Facebook API calls were involved. So the script code is rendered at the top of every page, even when it’s a JSON Ajax response. This of course means that the JSON cannot be parsed in the browser. Not good.
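The effect is easy to reproduce. The sketch below is my own illustration, not code from the Facebook client: `debugPrefix` is a shortened stand-in for the injected markup, and `tryParse` is a hypothetical helper that reports a parse failure as `null`.

```javascript
// What the server meant to send for an Ajax call.
var payload = JSON.stringify({ status: "ok", count: 3 });

// Shortened stand-in for the injected debug markup (assumption).
var debugPrefix = '<script type="text/javascript">var types = [];</script>';

// Returns the parsed object, or null if the body is not valid JSON.
function tryParse(body) {
    try {
        return JSON.parse(body);
    } catch (e) {
        return null;
    }
}

var clean = tryParse(payload);
var polluted = tryParse(debugPrefix + payload);
```

The clean body parses as expected, while the body with the script block in front of it is rejected by the parser because it no longer starts with valid JSON.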

Instead, I now use other tools to track which requests are being sent back and forth between my application and Facebook.

Posted in debugging, facebook

Internet-scale Java Web Applications

I am currently working on 2 application architectures. One is a PHP Facebook app (IFrame) with PostgreSQL in the backend, the other is a Glassfish/Jersey/Toplink/PostgreSQL stack.

When reading the glowing web 2.0 tech stories in the news and on sites like highscalability, it seems like just about everyone requiring an “internet-scale” architecture is using MySQL. Many are using stacks along the lines of {Python, Django | PHP, Zend}{memcached/MySQL} and take advantage of the new offerings from Amazon or Google to push their infrastructure to the cloud (Microsoft Azure is another big one, Sun has something cooking, and there are many smaller cloud service providers).

I also had it in the back of my head to go for EC2 in the near future for my apps, thinking of EC2 just as a vserver with more or less power on demand.

However when thinking about it, I was not so sure anymore if the architecture I am using is even ready for the Cloud – and ready to scale.

If you listen to the advocates of BigTable, traditional RDBMS are not suited for such endeavours. Nowadays all the hype seems to be about simple data structures, like hashtables, and doing the joins in an application layer. Another approach is sharding: divide the database into shards that share the same schema but each hold a different slice of the data, and direct e.g. user group x to shard z, ensuring that those users mostly only need data from this shard.
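As a rough sketch of the routing part, assuming numeric user ids and a fixed list of shards (the host names below are made up and `shardFor` is a hypothetical helper), the application layer could pick the shard like this:

```javascript
// Hypothetical shard list; the host names are made up for illustration.
var shards = ["db0.example.com", "db1.example.com", "db2.example.com"];

// Maps a numeric user id to a shard. The same id always yields the same
// shard, so all rows for one user live in one place.
function shardFor(userId) {
    return shards[userId % shards.length];
}
```

For example, `shardFor(7)` picks the second entry because 7 modulo 3 is 1. The simple modulo scheme has a known weakness: adding a shard changes the mapping for most users, so real setups often use a lookup table or consistent hashing instead.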

Where do JEE technologies fit in those high-scalability scenarios and why not Postgres – is the transactional db a scalability killer?

Let’s examine the concrete questions for my 2 use cases:

a) MySQL vs. Postgres

The traditional PHP application runs within Apache with mod_php using process forking, so every request is basically handled in its own PHP process. This is very different from the concept of a container. It implies that, with regard to data caching, there is nothing out of the PHP box. Maybe it is not so astonishing anymore that PHP has no connection pooling support; it just wouldn’t make sense. Quoting Rasmus Lerdorf, creator of PHP, from a 2002 interview:

A pool of connections has to be owned by a single process. Since most people use the Apache Web server, which is a multi-process pre-forking server, there is simply no way that PHP can do this connection pooling …
If/when the common architecture for PHP is a single-process multithreaded Web server, we might consider putting this functionality into PHP itself, but until that day it really doesn’t make much sense. Even Apache 2 is still going to be a multi-process server with each process being able to carry multiple threads. So we could potentially have a pool of connections in each process.

Connections to the MySQL MyISAM storage engine are apparently only 4KB and quite cheap. Oracle connections, on the other hand:

every single connection takes up 5MB in NT4 for Oracle 8i 816

The truth is that most of the MySQL vs. PostgreSQL comparisons found on Google are really outdated. Postgres has made huge performance gains in its latest version 8, and MySQL has significantly improved its transactional InnoDB engine. So in terms of performance it depends more on optimal configuration and design than on MySQL vs. Postgres. Both are good databases, and after going over an excellent presentation, “Scaling with Postgres” by Robert Treat, given at the Percona Performance Conference 2009, I feel I am in good hands using Postgres.

b) Data caching

1) Facebook app with PHP/Postgres

How would I cache to improve performance once I see that direct database access is taking too long? Well, memcached can actually be used with many database systems. It just happens that a lot of people use it with MySQL, but it also integrates with Postgres and many more.

Besides memcached I am sure there are other distributed caches usable with PHP.
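The usual way to put such a cache in front of the database is the cache-aside pattern, independent of which database sits behind it. Here is a minimal sketch of the idea; the plain object `cache` stands in for a memcached client, and `loadUserFromDb` and `getUser` are hypothetical names, not part of any real client library.

```javascript
// In-memory stand-in for a memcached client (assumption: a real client
// would talk to a memcached server over the network instead).
var cache = {};
var dbCalls = 0;

// Hypothetical database lookup; counts calls so the effect is visible.
function loadUserFromDb(id) {
    dbCalls++;
    return { id: id, name: "user" + id };
}

// Cache-aside: try the cache first, fall back to the database on a miss
// and store the result for the next caller.
function getUser(id) {
    var key = "user:" + id;
    if (Object.prototype.hasOwnProperty.call(cache, key)) {
        return cache[key];
    }
    var user = loadUserFromDb(id);
    cache[key] = user;
    return user;
}
```

Calling `getUser(1)` twice hits the database only once. The sketch leaves out expiry and invalidation on writes, which is where most of the real work is.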

2) Glassfish/Jersey/Toplink/Postgres

Here I am using JEE JPA with Toplink Essentials. The latter does not have clustering support, or at least none of production quality. EclipseLink 1.0, the open source Toplink code base, was a bit unstable the last time I looked at it (ca. Jan 2009).

So I guess I would have to look at other distributed caches. Fortunately there is no shortage of choices here: Hibernate integrates with EHCache and OSCache, to name a few. So I guess I do not have to worry too much about distributed caching for my JEE app right now.

c) Physical Infrastructure

My current vServer provider (which I absolutely cannot recommend, but that is another story..) charges about 100 CHF a month for 1 GB (3 GB burstable) RAM, 60 GB HD and a 2 GHz Xeon processor. I am already a bit short of RAM at times, so the next bigger package is a dedicated server, which starts at 200 CHF / month, or more realistically 350 CHF / month for a dual-core Xeon and 4 GB of RAM.

From the Amazon Website:

“As an example, a medium sized website database might be 100 GB in size and expect to average 100 I/Os per second over the course of a month. This would translate to $10 per month in storage costs (100 GB x $0.10/month), and approximately $26 per month in request costs (~2.6 million seconds/month x 100 I/O per second * $0.10 per million I/O).”
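The arithmetic in that quote is easy to redo for your own numbers. The sketch below uses the prices from the quote and assumes a 30-day month; the variable names are mine.

```javascript
// Prices as quoted above: $0.10 per GB-month and $0.10 per million I/O.
var storageGb = 100;
var ioPerSecond = 100;
var secondsPerMonth = 30 * 24 * 60 * 60; // 2,592,000 for a 30-day month

var storageCost = storageGb * 0.10;
var millionsOfIo = (secondsPerMonth * ioPerSecond) / 1e6; // 259.2 million
var requestCost = millionsOfIo * 0.10;

var total = storageCost + requestCost;
```

This comes out at $10 for storage and roughly $26 for requests, so about $36 a month in total for the example workload.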

Given that my app is not a video/media sharing app, the scenario would be a small instance that is always on and moderate Elastic Block Store (EBS) requirements for the data storage. This gives me a rough estimate using their handy calculator:

  • Small (1.7 GB RAM,..) Linux Instance (always on 1 month, 36$ EBS costs): 118 $
  • Large (7.5 GB RAM,..) Linux Instance (always on 1 month, 36$ EBS costs): 363 $

The instance types of EC2 also include high performance CPU instances and different OS. For me right now something between a small and a large instance would be ideal, just in terms of RAM. (I mean just for a single glassfish instance the recommended memory allocation is 1 GB ..). Maybe having 2 small instances would be the best solution in my case.

So overall I guess I will go with EC2. There are a bunch of articles, questions and comparisons out there that list all the pros and cons of dedicated servers, cloud providers and vservers. Fact is that Amazon has been a leader in the cloud space and has constantly improved its services. The usability of the Management Console has also increased significantly.

d) Impacts of EC2 on Application Architecture / Clustering

On the web server / DNS tier EC2 offers Elastic IP addresses: static public IP addresses that are tied to your AWS account rather than to a particular instance. The IP addresses of the instances themselves will change upon reboot, but they are only private, so you don’t have to worry about this. Distributing load across the instances is the job of Elastic Load Balancing, which EC2 offers as well.

One problem with EC2, though, is in the application tier, because there is no multicast. That makes sense when you think about the potential network flood it could generate. This is a problem because most applications, frameworks and application servers rely on multicast for their clustering solutions, in order to discover other service instances.

I found a nice article on a Terracotta architecture that solves this problem. Terracotta provides clustering and caching for Java objects by instrumenting the Java bytecode and doing things like (pre)fetching content or updating copies. They do this via TCP/IP and therefore enable clustering and distributed caches that do not rely on multicast. What’s really cool is that they recently went open source and you can download their software for free!

How does Terracotta work?

A few interesting quotes from their forum:

Every application node is connected to the Terracotta Server Cluster via a TCP connection. There is no multicast. Terracotta is very efficient over the network. Because it intercepts field-level changes, only the changes to your objects are sent across the wire. In addition, objects do not live everywhere, so Terracotta only sends changes where objects are resident. In the case where you have a well partitioned application, this means that on average, your changes will only be copied to the Terracotta Server Cluster, and not to all of the application nodes (because they don’t need a copy of objects they do not have a reference to in Heap)

Just because one has 1000 clients running the same application doesn’t mean all data is everywhere. One of the features of Terracotta is that it has a virtual heap. Objects are only where they need to be when they need to be there. Some users do have large numbers of clients and it works quite well. Scale is more of a question of concurrency and rate of change than number of clients.

The Terracotta server uses an efficient mechanism to send changes using Java NIO under the covers to achieve high scalability.

There are integrations with several App Servers, among them Glassfish. Yes!

Summary

Without further ado, my takeaways from this rather long post are:

  • Postgres does not per se underperform MySQL
  • Memcached can be used with Postgres
  • Do not use persistent DB connections in PHP, ever
  • EC2 will fit my bill for infrastructure/hosting
  • Terracotta will be a good candidate for clustering in an EC2 environment without multicast
  • Hibernate with EHCache, JBoss Cache or OSCache is your distributed cache replacement for Toplink Essentials

Posted in EC2, Scalability, Terracotta, Web2.0

FB Series: Integrating JS


Where are we coming from

JavaScript is one of the most popular languages for web programming. Although it has been around for more than a decade (it came from Netscape in 1995), it was not until the advent of Ajax and Web 2.0 that JavaScript came into the spotlight and attracted more professional programming attention.

The open source movement of JavaScript frameworks started in 2005 with Prototype and script.aculo.us. Since then, programming appealing JavaScript-based websites has become so much easier. Nowadays it’s literally possible to mash up widgets with little to no JavaScript knowledge and set up a stunning page. I would however still recommend that someone who does not know JavaScript first learn the basics and maybe keep a JS reference & DOM reference handy.

Looking at the core functionality of today’s frameworks, what do we have?

  • DOM Traversal with CSS selectors to locate elements
  • DOM Modification: Create/remove/modify elements
  • Short syntax for adding/removing Events
  • Short syntax for Ajax requests
  • Animations (hide/show/toggle) & transitions
  • User interface widgets (drag & drop, tree, grid, datepicker, modal dialog, menu / toolbar, slider, tabbed pane)
  • Wide Browser Support

Choosing a JS Framework

On the net you will find a myriad of comparisons of popular frameworks. The most popular open source and free JS frameworks available today are:

  • Prototype (& Scriptaculous for UI)
  • Dojo
  • jQuery
  • YUI
  • Mootools

I would say that all of those have their advantages and drawbacks and there’s no obvious winner. It really depends on what you use it for. Some links that helped me choose are:

Wikipedia – to get a feeling on the features
unbiased functional comparison
Stackoverflow comparing JQuery Dojo and more
Performance comparison

Choosing a framework, as mentioned, depends a bit on how you want to use JavaScript:

Plug-and-Play:

  • Drop in a “calendar widget” or “tabbed navigation”
  • Little, to no, JavaScript experience required.
  • Just customize some options and go.
  • No flexibility.

=> Widgets

Some Assembly Required

  • Write common utilities
  • Click a link, load a page via Ajax
  • Build a dynamic menu
  • Creating interactive forms
  • Use pre-made code to distance yourself from browser bugs.
  • Flexible, until you hit a browser bug.

=> Libraries

Down-and-Dirty

  • Write all JavaScript code from scratch
  • Deal, directly, with browser bugs
  • Quirksmode is your lifeline
  • Excessively flexible, to the point of hindrance.

=> Raw JavaScript

Of course you can also mix different approaches, just as you can mix frameworks, but in terms of maintainability and productivity it would be better if you don’t have to.

For my personal use case I had the following requirements:

  1. Easy to learn with minimal intuitive syntax
  2. Lightweight solution
  3. Availability of some widgets: Datepicker, Grid, maybe more
  4. Appealing Web 2.0 effects
  5. Production quality

Right now I have chosen jQuery (incl. jQuery UI) and will see how far I get with it. To me it was the most appealing in terms of the above criteria. I am not sure if I need a sophisticated data table; if so, I might have another look at the YUI data table, which seems to me one of the best widgets for this (compare to e.g. this jQuery plugin).

For development I guess it’s still essential to have good JavaScript support in your IDE and a JavaScript debugger. I use Firebug and Netbeans for this. If you are into Eclipse I strongly suggest taking a look at Aptana; their IDE is really great for JavaScript and PHP, but unfortunately not optimal for Facebook development.

As a last hint: consider using Google Ajax Libs to speed up JS loading on your clients. However, do not use it in local development. It just does not work reliably (loads too late etc.).

Posted in facebook, javascript