Mar 28 2009

Automated content gathering

Information is Power
How to collect data from several websites automatically?

Very often a web user visits several websites to gather important information on a regular basis, and doing it by hand is really annoying.

Just imagine how useful it would be if this information were gathered, rendered into a proper format and sent by email at an exact time.

I have made a couple of scripts for myself that scan websites (authorization required) and send me a report email if something valuable appears.

I am thinking of a web service that would let you specify data sources and processing rules for the collected data, but I need some confirmation from people that such a service is really needed.

Automated Data Collection service Specifications

  • Collect data from websites that require authorization
  • Present data in different formats: xml, csv, txt, doc
  • Check for data updates as often as every minute

The service is rather complex, so setting up the gathering will require some developer attention.

How does it work?

The service can be divided into three main parts.

  • 1. Built-in browser that fetches the content of website pages
  • 2. Web page parser (unique for each page)
  • 3. Data structuring

The third part is the simplest. We already have all the parsed data and can do whatever we want with it.

The second part is unique for each page: we have to teach the script to pull the required words, images and numbers out of the page HTML. Sometimes this can be a complex task.

The first part is the most complex. Anti-bot features are sometimes impossible to get around… Take an image captcha: in simple cases we can read the code from it automatically, but in general it is too expensive.
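
Here is a rough sketch of the second and third parts, assuming the page HTML has already been fetched. The markup pattern, the field names and the `parseItems` / `toCsv` functions are made up for illustration; every real page needs its own parsing rules.

```javascript
// Part 2: page-specific parser. The <li class="item"> markup is a
// hypothetical example, not taken from any real site.
function parseItems(html) {
  const pattern =
    /<li class="item">\s*<span class="name">([^<]*)<\/span>\s*<span class="price">([^<]*)<\/span>\s*<\/li>/g;
  const items = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    items.push({ name: match[1].trim(), price: Number(match[2]) });
  }
  return items;
}

// Part 3: structuring. Turn the parsed records into CSV text.
function toCsv(items) {
  // Quote a cell only when it contains a comma, a quote or a line break.
  const quoteCell = (value) => {
    const text = String(value);
    return /[",\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  };
  const lines = items.map((item) =>
    [item.name, item.price].map(quoteCell).join(",")
  );
  return ["name,price", ...lines].join("\n");
}

const sampleHtml =
  '<ul><li class="item"><span class="name">Widget</span><span class="price">9.5</span></li>' +
  '<li class="item"><span class="name">Gadget, large</span><span class="price">20</span></li></ul>';

console.log(toCsv(parseItems(sampleHtml)));
```

A regular expression is enough for a small, stable page; for anything messier a real HTML parser would probably be a safer choice.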


Jan 30 2009

Freelancing Advantages and Disadvantages…

How to find out if the freelancer is good

The World Wide Web made it possible for the whole world to rush into freelancing.

Millions of people who can do a little of anything can now call themselves freelancers! What freedom…

Freelance is a synonym for unprofessionalism. Please, freelancers, forgive me! But it is true.

The first advantage, the one on top, is cheap service (by the way, this is the first reason for the unprofessionalism). If you go to a big company and request a project, you will be charged three to five times more than for doing the same project with a freelancer.

The best method of managing projects is close personal communication: build a circle of people and learn what they can do and what they cannot.

Now there are lots of freelance communities where you can see each freelancer’s statistics, past projects and reviews. It helps, but not very much.

Let me give you an example. One person had an idea to make a social networking site. A social network project must be very expandable and changeable. He found a freelancer who did an excellent job. From the browser it looked fine. He gave an excellent review and paid with pleasure. A few months later he realized he needed to add several features and make some changes. He found another freelancer and got the advice to drop his current system and start from scratch. The site was great from the browser, but inside it was terrible. That is how freelance unprofessionalism works. The problem is that the client does not even know how to check the job. In big companies there is an ethic of project building, with rules for how things should be built.

Here is a piece of advice on how to find a good freelancer. You should check not just the reviews of past work but also the past clients. If at least one client is a web specialist and he gave the freelancer an excellent review, the freelancer is good.


Jan 30 2009

He wanted to make it cheap.

There is lots of free or very cheap software on the web: free CMSs, blogs, forums, etc.
Good designs on Template Monster for cheap.

I have a story for you about one sad client from Africa. He is not a pro in web business, so he did it this way. He bought a shopping cart for $300. The cart was not exactly what he needed, so he went to a development studio for a revision. The developer estimated the revision at $2500. There was not much to do, but the shopping cart was old and unprofessionally coded… The best thing about that shopping cart is that there are fools who still buy it.
What did he do wrong?
1. Bought old, hard-to-revise software. (He couldn’t figure out which is good and which is not.)

Here is another story.
One person from Europe took a completely free and very flexible product catalog. It is extremely flexible software, but very slow, and the original installation takes 40-70 MB on the hard disk. He wanted to redesign it and add an additional module. The developer estimated this work at $5000. Because of the flexibility in product details and other features, the system had become very complex.
What did he do wrong?
1. Took too complex a thing to revise.
What is the conclusion? If you don’t know what to do, ask an IT consultant!


Dec 8 2008

Input fields JavaScript validators

This post contains useful JavaScript validation scripts that can easily be added to HTML.
Note: having presubmit validation does not mean you don’t need validation on your server side. JavaScript validation is only for usability. It saves the visitor’s time and nerves.

Email is one of the most frequently requested pieces of information on the web now. It works like a “tell us you are a human, not a bot” check.
Example with JavaScript email validation

Please enter email:

You can download a zip file with the email validation JavaScript and an HTML example.
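
For readers without the zip file, here is a minimal sketch of what such a check might look like. It is not the script from the download; the `isValidEmail` and `emailHint` names and the deliberately loose pattern are assumptions made for illustration.

```javascript
// Loose usability check: "something@something.tld" with no spaces.
// It is intentionally simple; the server must still validate the address.
function isValidEmail(value) {
  const email = String(value).trim();
  return /^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/.test(email);
}

// Returns the hint to show next to the field, or an empty string if all is well.
function emailHint(value) {
  return isValidEmail(value) ? "" : "Email should be like name@yourdomain.com";
}

console.log(emailHint("name@yourdomain"));
```

In a page, `emailHint` could be called from the field’s `oninput` or `onblur` handler so the visitor sees the hint before submitting.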


Dec 7 2008

How to make usable input forms, or what it means to respect your visitor

This article describes how to optimize input fields and textareas for usability. It could be useful for beginners in website building.

In the web 2.0 era, form elements are an essential part of almost every page on the web.
Registration, posting, payment procedures and many, many more places where we can find a form.
The time when sites were just pages with information has passed. Now the web talks to people, and this is how it should be.

So form processing should be extremely careful with a person’s time.

Here are 5 basic rules for how forms should be built.

  • 1. Never lose entered form field data!
    if you do, the person might not try to fill it out one more time
  • 2. Do not make huge forms. Fewer than 10 fields at a time.
    if you need more, make several steps
  • 3. Use comments for each field.
    if it is an email field or phone number, show an example
  • 4. Make good validation.
    Use JavaScript or Ajax to check whether the textbox is filled in correctly
  • 5. The form should be optimized to be filled in and processed correctly in one or two tries!

Now let’s look at more detailed tips with examples.

Registration form

How many fields are required for registration? Five? Six?
Who will give more?

I say 1 - 3.
Don’t try to ask the visitor for all the required information on the registration form!
This is annoying!

Ask for an email and that is all… Just one field.
Send an email with further instructions where the person will be able to set a password and fill in other required fields.

At most you can ask the person for a unique login name and a password, with email confirmation set up at the next step.

So we have two ways to make a registration:
1. One input field with email

Please enter email:

Email should be like name@yourdomain.com

Make sure that this is a real email; after submission you will receive a letter with further instructions

2. Three input fields - email or login and password with confirm password

Please enter email/login:

Email should be like name@yourdomain.com

Please enter password

Please confirm password

Put yourself in the visitor’s place and try to fill in your own forms.
Forget the password rules and enter an invalid one. How will it go? One invalid password, two invalid passwords. That is all, the visitor is lost.
State what is right and what is wrong before an error message occurs.
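
As a rough sketch of the rules above for the three-field form: the function below checks every field at once, reports all problems together and hands the entered values back so nothing the visitor typed is lost. The `validateRegistration` name, the field names and the minimum password length of 6 are assumptions made for illustration.

```javascript
// Assumed rule for this sketch; show it next to the field before any error occurs.
const MIN_PASSWORD_LENGTH = 6;

function validateRegistration(input) {
  // Keep what the visitor typed so the form can be shown again pre-filled.
  const values = {
    email: String(input.email || "").trim(),
    password: String(input.password || ""),
    confirm: String(input.confirm || ""),
  };
  const errors = {};
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/.test(values.email)) {
    errors.email = "Email should be like name@yourdomain.com";
  }
  if (values.password.length < MIN_PASSWORD_LENGTH) {
    errors.password =
      "Password must be at least " + MIN_PASSWORD_LENGTH + " characters";
  }
  if (values.confirm !== values.password) {
    errors.confirm = "Passwords do not match";
  }
  return { ok: Object.keys(errors).length === 0, values, errors };
}

console.log(
  validateRegistration({ email: "name@yourdomain.com", password: "abc", confirm: "abd" })
);
```

Reporting every problem in one pass helps keep the visitor within the one or two tries the fifth rule asks for.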


As I said above, it is good to validate input fields not only on the server side but also while the form is being filled in.
Example with JavaScript email validation

Please enter email:

You can download a zip file with the email validation JavaScript and an HTML example.


Dec 4 2008

Web site review. How to review any website?

This article explains how to get information about the quality of any website on the World Wide Web. It is written for new guys who have just made their site and want to understand what they have done.

I – subjective view

A website is characterized by its technical side and by its design.
The technical side is the quality of the source your website generates: correct HTML and the absence of bugs in functionality and JavaScript. The technical side also covers multi-browser support.
Let us start.

  • 1. You can check HTML validity at http://validator.w3.org/ (W3C offers a separate validator for CSS). This free online tool lets you see whether the website’s HTML is built according to W3C standards. It is a very good sign if W3C gives you the “This document was successfully checked as XHTML 1.0 Strict!” message (the doctype does not necessarily have to be XHTML).
  • 2. You can check how your website looks in different browsers. There are a number of online services that generate snapshots of your website opened in many browsers.
  • 3. Testing! If the website is a static HTML presentation of some information, there is not much to test… Just make sure that all links work correctly and display what they are supposed to display.
    If the website has complex functionality, testing can take a lot of time. Each form should be filled in with various values. Email sending, passwords, registrations, sending messages, uploading files and so on. Every button should be pressed several times! And even this will not give a 100% chance that everything works.
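
For the link check on a static site, a small script can at least list every link on a page so that none is forgotten. The sketch below is only an illustration: `extractLinks` is a made-up name, and the simple pattern assumes quoted `href` attributes.

```javascript
// Collect the unique href values of all <a> tags in a piece of HTML.
function extractLinks(html) {
  const pattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']*)["']/gi;
  const links = new Set();
  let match;
  while ((match = pattern.exec(html)) !== null) {
    links.add(match[1]);
  }
  return [...links];
}

const pageHtml =
  '<a href="/about">About</a> <a class="menu" href=\'contact.html\'>Contact</a> <a href="/about">About again</a>';

console.log(extractLinks(pageHtml));
```

Each address in the list can then be opened by hand or requested by a script to see whether it shows what it is supposed to show.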

Design. This is how your site looks. It should be pleasant to the eye. All valuable links should be easy to find. All valuable information should be in the right places and marked with color or font size. This assessment is more creative than the technical part. Very often the author of a website does not see any mistakes in it. For this reason it is good to ask other people to review it and comment live on “What is wrong and what is ok”.

II – multi subjective view

People! People should talk. Ask your friends and enemies (hope there are not many of them). Beyond them, there are many forums and blogs on themes close to your website where you can post an invitation to review it. People there will not spend a lot of time on your site but will give many design or usability comments, which can be very valuable.

III – complex view

This is deep. This is all about web 2.0: in short, in web 2.0 information fights for its place to be viewed. It means every page with some info or functionality should be monitored to see whether people read it and find it valuable.
By now there are several ways to gather statistics about how much time a user has spent on a page, how many times he has viewed it, or whatever other info can be collected about this pair, person + page. For example, people from Canada do not like one page; this can be seen in Google Analytics and then fixed. By studying how people behave on your site you can find many ways to make it better.


Nov 22 2008

Once I have visited my website…

Two months of work. Seven design variations drowned in the archive.
Thousands of pixel-left, pixel-right changes. And what is the result?

A standard design studio website with several pages and a contact form… Oh yes, and one more Ajax quick contact form.
Now I imagine how people open the home page and see what they have already seen a million times, maybe in different colors and with different texts, but the idea is the same: “We offer everything you need…”.
And I think: “Is this all?? Is this all I’m capable of??”
A simple static brochure with several pages and two forms!
The information on these pages can be found everywhere worldwide.
Where is the uniqueness? How do you make yourself stand out?


Nov 13 2008

Spam control methods interview

Hello everyone!

Do you like spam?

All these emails, posts and comments, gaming search engine rules or just trying to grab our attention.

I don’t like spam. Of course there is no way to live without spam. Why?

It is simple. Millions of people don’t know what they need! I’m not joking. It is so. They need spam; they need to be shown everything in this world to make them choose. They don’t want to search. They just want to rest in front of the TV and wait until “Got it! That is what I have wanted my entire life!”

Do you know what you need? If yes, you hate spam!

Here are a couple of things you can do on your contact page to prevent spam bots from filling it in.

  • 1. Hidden field in the contact form
  • 2. JavaScript submit
  • 3. Ajax work

Now a little closer look.

1. Hidden field in contact form

Many bots just scan the HTML form looking for text fields to fill in. They most likely do not notice that a text field is hidden:

<input type="text" name="checker" value="" style="display:none" />

If a bot fills out the form, this “checker” field will be filled in, and on the server side you can check it and stop processing if it is not empty.
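
The server-side part could look roughly like this in JavaScript (for example under Node.js). It is a sketch only: `fields` is assumed to be the already-parsed form body as a plain object, and `isHoneypotTripped` is a made-up name.

```javascript
// A real visitor never sees the hidden "checker" field, so it arrives empty.
// Anything typed into it suggests the form was filled in by a bot.
function isHoneypotTripped(fields) {
  const value = fields.checker;
  return typeof value === "string" && value.trim() !== "";
}

console.log(isHoneypotTripped({ name: "Ann", checker: "" }));
console.log(isHoneypotTripped({ name: "Bot", checker: "http://spam.example" }));
```

When the function returns true, the handler can simply stop without sending the message.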

2. JavaScript submit

Almost the same thing here.

You have a submit button:

<input type="submit" name="Action" value="Send" />

Spambots just find the submit button and add it to the other fields in their form. They do not actually push that button. So we will cheat them!

Here is the example.

1. Change the submit button to

<input type="button" name="Action" value="Send" onclick="MyFunction(this.form)" />

2. Add a hidden field

<input type="hidden" name="BotCheater" id="BotCheater" value="-1" />

It doesn’t matter what you name it. It is needed for the check on the server side.

3. Add JavaScript to submit the form

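<script type="text/javascript">
// Called by the button's onclick handler. A bot that never runs this
// script leaves the hidden field at its default value of "-1".
function MyFunction(form)
{
  document.getElementById('BotCheater').value = "this is not a bot!";
  form.submit();
}
</script>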

After that, on the server side, process the submission only if the “BotCheater” value is “this is not a bot!”
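
A rough sketch of that server-side check, again in JavaScript. `fields` is assumed to be the parsed form body as a plain object keyed by field name (the hidden field above is named BotCheater), and `passedScriptCheck` is a made-up name.

```javascript
// Must match the value that MyFunction writes into the hidden field.
const EXPECTED_TOKEN = "this is not a bot!";

// True only when the page's own script ran before the form was submitted.
function passedScriptCheck(fields) {
  return fields.BotCheater === EXPECTED_TOKEN;
}

console.log(passedScriptCheck({ BotCheater: "-1" }));
console.log(passedScriptCheck({ BotCheater: "this is not a bot!" }));
```

A fixed string can be copied by a bot written specifically for your site, so this mostly filters out generic form-fillers.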

3. Ajax

This one is harder than the first two methods. Many spambots don’t run JavaScript, so they will not be able to send your form if it can be sent only through Ajax.
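
A minimal sketch of sending the contact form through Ajax, assuming a hypothetical `/contact` address on your server. The `encodeFields` and `sendContactForm` names are made up for illustration; `sendContactForm` works only in a browser, where `XMLHttpRequest` exists.

```javascript
// Percent-encode the fields into a "name=value&name=value" request body.
function encodeFields(fields) {
  return Object.keys(fields)
    .map((name) => encodeURIComponent(name) + "=" + encodeURIComponent(fields[name]))
    .join("&");
}

// Browser-only: post the fields in the background instead of a normal submit.
// "/contact" is a hypothetical endpoint; onDone receives true on a 2xx answer.
function sendContactForm(fields, onDone) {
  const request = new XMLHttpRequest();
  request.open("POST", "/contact");
  request.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
  request.onload = () => onDone(request.status >= 200 && request.status < 300);
  request.onerror = () => onDone(false);
  request.send(encodeFields(fields));
}

console.log(encodeFields({ name: "Ann Lee", email: "a@b.com" }));
```

With this approach the form needs no working `action` address at all, so a bot that only reads the HTML has nowhere to post.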

That is all for now.

Leave a comment if you have any questions.