12.14.05

How does StringAssert.Equals work?

Posted in C Sharp at 6:33 am by Frank

While trying some features of unit testing in Visual Studio 2005 Team Suite, I never expected that StringAssert.Equals("abcde", "fgdhi") would pass. It puzzles me a lot, and I had to use Assert.AreEqual in the end. How does it work? It seems useless and will often confuse testers.
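One likely explanation, offered as an assumption and not checked against the Visual Studio 2005 documentation: StringAssert declares no Equals method of its own, so the call binds to the static Object.Equals(object, object) that every .NET class inherits. That method only returns a boolean and never throws, so a test that ignores the result passes. The Python sketch below is an analogy, not the MSTest API; equals, assert_are_equal and run_case are hypothetical names.

```python
def equals(a, b):
    """Plain comparison: returns a boolean and never raises."""
    return a == b


def assert_are_equal(expected, actual):
    """Assertion: raises AssertionError when the values differ."""
    if expected != actual:
        raise AssertionError("expected %r but got %r" % (expected, actual))


def run_case(case):
    """Tiny stand-in for a test runner: a case passes unless it raises."""
    try:
        case()
    except AssertionError:
        return "failed"
    return "passed"


def comparison_case():
    equals("abcde", "fgdhi")  # the result is discarded, so nothing fails


def assertion_case():
    assert_are_equal("abcde", "fgdhi")  # raises, so the case fails
```

Under this reading, the call in my test behaves like comparison_case, while Assert.AreEqual behaves like assertion_case.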

12.10.05

Heights, it’s been a long day

Posted in Movies at 6:32 am by Frank

Heights is a highly recommended movie on IMDb. It is about five New Yorkers who turn out to be connected as the story goes on. Everything happens within 24 hours. Isabel is a talented 25-year-old photographer. She is so self-involved that she doesn’t even discover that her fiancé is gay and in a relationship with a guy. And she doesn’t know the name of the man who talked with her throughout her mother’s party and saved her at the subway entrance. Quite a few viewers will be very disappointed at the end of the movie when they find out that the whole story is driven by a closeted guy, Jonathan. He lies to Isabel. He wants to marry her, so he keeps his past from her. But all these secrets come out in the end. Alec is a great guy. He is in love with Jonathan, he is frank with him, and he sacrifices a lot for him.

Just as the plot outline on IMDb says, all these people must choose their destiny before the sun comes up the next day. Will Isabel end up with her hero, Ian? When she read Ian’s name in the hospital, I really wished that she would learn not to be so self-involved and stop ignoring the things around her. How about Jonathan and Alec? They said they would start again. Is it possible for two guys like them to end up together in real life?

This is really a great movie, showing us many places in New York. What impressed me most were the squirrels under the trees beside the street and the dozens of chairs somewhere in the city. I’ve never seen so many movable chairs in any city.

12.08.05

implementing a simple net-spider

Posted in Python at 1:02 am by Frank

I first heard about net-spiders years ago. A spider downloads the HTML source of a URL, extracts useful information from it, and then recursively processes the URLs it links to in the same way. Google uses a spider to download pages from many sites into its databases so that it can search web content for us. Junk-mail programs use spiders to harvest e-mail addresses from the internet and then send junk mail to them. So it’s very interesting to code a spider of my own, and I implemented one in Python. It’s a very simple one, but it works. After running it excitedly for some time, I found a very big problem. Surely it’s not as efficient as a real-life spider, but that is not my problem here: it uses a lot of memory after running for some time, more than 300 MB. Since this blog’s CSS makes source code unreadable, I have posted my spider.py as an attachment; it can be downloaded and opened from inside a web browser, and it’s very readable. Below are some brief comments on how spider.py works.

I use the spider to find out how many words there are on msdn2.microsoft.com, which hosts the online documentation for .NET 2.0. There are 5 classes in total, as shown below:
spider.png

The SynchronizedObject class just has a lock() and an unlock() method, so that access to an object can be made thread-safe.
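A minimal sketch of such a base class, written for Python 3 and not taken from spider.py. The Tally subclass and the bump_many helper are hypothetical, added only to show lock() and unlock() in use.

```python
import threading


class SynchronizedObject:
    """Base class that owns one lock and exposes lock()/unlock()."""

    def __init__(self):
        self._lock = threading.Lock()

    def lock(self):
        self._lock.acquire()

    def unlock(self):
        self._lock.release()


class Tally(SynchronizedObject):
    """Hypothetical subclass: a counter that many threads can bump safely."""

    def __init__(self):
        super().__init__()
        self.value = 0

    def increment(self):
        self.lock()
        try:
            self.value += 1
        finally:
            self.unlock()  # always release, even if the body raises


def bump_many(tally, times):
    for _ in range(times):
        tally.increment()


tally = Tally()
workers = [threading.Thread(target=bump_many, args=(tally, 1000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

The try/finally pair matters: if unlock() is skipped after an exception, every other thread that calls lock() waits forever.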

The FetchThread class does the following things in order, in a separate thread:
1. downloads the HTML of its URL
2. extracts the content of the HTML and adds it to WordCounter
3. extracts sub-URLs from the HTML and adds them to NewLinks
All the steps can be seen in the source code of FetchThread.run().
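Steps 2 and 3 can be sketched roughly as follows. This is not the code from spider.py: it is Python 3, it uses the standard library's html.parser in place of whatever parser the script uses, and PageParser and process_page are hypothetical names. Step 1 (the download, for instance with urllib.request.urlopen) is left out so that the sketch runs offline.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects visible text and absolute link targets from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def process_page(base_url, html):
    """Return (text, links) for a page that has already been downloaded."""
    parser = PageParser(base_url)
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


sample = '<html><body><p>Hello spider</p><a href="/docs">Docs</a></body></html>'
text, links = process_page("http://example.com/index.html", sample)
```

In the spider, text would go to WordCounter and links to NewLinks.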

The ThreadList class contains a list of threads. The maximum number of threads is defined by __maxcount, which can be set from its constructor. In my spider.py I define max_thread_count as 20. The GetCount() method returns the current count of __threads. The GetThread() method checks whether any new URLs are available in NewLinks; if there are, it spawns a thread for each of them. removeThread() removes a thread from the __threads list when that thread has finished its job.

WordCounter receives some text, splits it into words, and saves each word together with its number of occurrences.
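A word counter along these lines could look like the sketch below (Python 3, not the code from spider.py). The method names add_text and total, and the rule for what counts as a word, are assumptions.

```python
import re
from collections import Counter


class WordCounter:
    """Splits text into words and keeps a running count per word."""

    def __init__(self):
        self.counts = Counter()

    def add_text(self, text):
        # Assumption: a word is a run of letters, digits or apostrophes,
        # compared case-insensitively.
        words = re.findall(r"[a-z0-9']+", text.lower())
        self.counts.update(words)

    def total(self):
        """Total number of words seen so far, repeats included."""
        return sum(self.counts.values())


counter = WordCounter()
counter.add_text("The spider fetches the page.")
counter.add_text("The page has links.")
```

When several threads call add_text at once, the update should be wrapped in the lock()/unlock() pair described above.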

NewLinks saves all the URLs. For each URL I save its address, level and status. Status 0 means a new URL, 1 means in process, and 2 means processed. The getCount() method returns the count of unprocessed URLs.
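The bookkeeping can be sketched like this (Python 3, not the code from spider.py). The method names add, take, mark_processed and get_count are hypothetical, and the sketch assumes that "unprocessed" means status 0 only.

```python
NEW, IN_PROCESS, PROCESSED = 0, 1, 2  # status codes as described above


class NewLinks:
    """Remembers every URL seen, with its level and status."""

    def __init__(self):
        self._links = {}  # url -> [level, status]

    def add(self, url, level):
        """Register a URL once; repeats are ignored. Returns True if added."""
        if url in self._links:
            return False
        self._links[url] = [level, NEW]
        return True

    def take(self):
        """Hand out one new URL as (url, level) and mark it in process."""
        for url, entry in self._links.items():
            if entry[1] == NEW:
                entry[1] = IN_PROCESS
                return url, entry[0]
        return None

    def mark_processed(self, url):
        self._links[url][1] = PROCESSED

    def get_count(self):
        """Number of URLs that are still new (status 0)."""
        return sum(1 for _, status in self._links.values() if status == NEW)


links = NewLinks()
links.add("http://example.com/", 0)
links.add("http://example.com/a", 1)
duplicate_added = links.add("http://example.com/", 0)  # already known
first = links.take()
```

Note that a structure like this never forgets a URL, so it grows for as long as the spider runs.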

I use a timer to check whether there are threads running and whether there are more unprocessed URLs. If there are, I use thread_list.getThread() to spawn some threads to spider the URLs.
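The overall control loop can be sketched as below. It is a simplified stand-in for spider.py, not a copy of it: a sleep-based polling loop replaces the timer, plain lists replace ThreadList and NewLinks, and crawl and its fetch parameter are hypothetical names. The fetch function is supplied by the caller and returns the sub-URLs of a URL, so the sketch can run against a fake site without any network access.

```python
import threading
import time

MAX_THREADS = 20  # same limit as max_thread_count in the post


def crawl(start_urls, fetch, max_threads=MAX_THREADS, poll_seconds=0.01):
    """Poll until no URL is pending and no worker thread is running."""
    lock = threading.Lock()
    pending = list(start_urls)
    seen = set(start_urls)
    running = []

    def worker(url):
        found = fetch(url)
        with lock:
            for link in found:
                if link not in seen:
                    seen.add(link)
                    pending.append(link)

    while True:
        running = [t for t in running if t.is_alive()]  # drop finished threads
        with lock:
            # Spawn workers for pending URLs, up to the thread limit.
            while pending and len(running) < max_threads:
                thread = threading.Thread(target=worker, args=(pending.pop(),))
                running.append(thread)
                thread.start()
            if not pending and not running:
                return seen
        time.sleep(poll_seconds)


# A fake four-page site: each key maps to the links found on that page.
site = {"/": ["/a", "/b"], "/a": ["/b", "/c"], "/b": [], "/c": ["/"]}
visited = crawl(["/"], lambda url: site.get(url, []), max_threads=2)
```

The loop only returns when the pending list is empty and every worker has finished, because a worker that is still running may yet add new URLs.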

I suspect it’s the huge amount of string garbage that makes it use so much memory, but I’m not sure. There may be other reasons; I didn’t find any clues.

The original script file is here. Download it and rename it to “spider.py”.

12.07.05

This pic explains why ruby on rails.

Posted in Java, Uncategorized at 9:31 am by Frank

I always choose Python as my scripting language, but Ruby on Rails is very hot these days, so many people are turning their heads to Ruby, me included. Some people say that Ruby is more OO than Python. After reading a few Ruby docs, I found that Ruby is kind of like BASIC, using end to mark the end of a block, and its syntax seemed very weird to me, so I didn’t go any deeper. But this picture is so funny and persuasive that I can’t help looking through the Ruby docs again.

12.06.05

Tried NHibernate

Posted in C Sharp at 3:32 am by Frank

Hibernate is so cool, so when I heard of its .NET clone, NHibernate, I couldn’t wait to have a look at it. NHibernate is not as mature as Hibernate. Its newest release is 1.01, which should be stable, I think. I googled to see if there were some good materials for learning it. I found a quickstart yesterday, and it took me a few hours to go through because I am not familiar enough with some .NET concepts. Today I found an article on CodeProject named “NHibernate in real world applications”. It seems very helpful. When I downloaded its code and ran it in Visual Studio, I got errors. I found in the comments that some people had hit the same error, and reading their discussion let me fix it. The problem is an incompatibility between ISet and IDictionary. Collections in .NET are currently not as powerful as those in the JDK; for example, there is no ISet, but Hibernate uses many Set collections. NHibernate now uses Iesi.Collections.ISet to map sets.

Someone in the comments of the article said “nHibernate is garbage”. I don’t know if it’s a good choice for real-life projects, since a lot of time must be spent learning it, many strange problems come up along the way, and none of them is easy to solve.

12.05.05

Writing an empty directory into a zipfile

Posted in Python at 1:42 pm by Frank

In the zipfile module, use ZipFile.writestr(ZipInfo(directoryname+"/"), "") to add an empty directory named directoryname to a zip file. The following function compresses a whole directory, including its empty directories, and saves it as a file:
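import os
import zipfile
from os.path import join

def CompressDirectory(DirName, filename):
    def CompressDir(directory, zf):
        # A zero-length entry whose name ends in "/" records the directory,
        # so empty directories survive in the archive.
        zf.writestr(zipfile.ZipInfo(directory + "/"), "")
        for name in os.listdir(directory):
            full_path = join(directory, name)
            if os.path.isdir(full_path):
                # full_path already includes directory; do not join it again.
                CompressDir(full_path, zf)
            else:
                zf.write(full_path)
    zf = zipfile.ZipFile(filename, "w", zipfile.ZIP_DEFLATED)
    try:
        CompressDir(DirName, zf)
    finally:
        zf.close()  # writes the central directory; the zip is unusable without it
    print("done")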
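To check that an empty directory really ends up in the archive, a variant like the one below can be run against a temporary tree. It is a sketch for Python 3, not a replacement for the function above: zip_tree is a hypothetical name, it walks the tree with os.walk, stores names relative to the root, and writes a directory entry only for directories that are empty.

```python
import os
import tempfile
import zipfile


def zip_tree(root, archive_path):
    """Zip everything under root, keeping empty directories as entries."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for current, dirs, files in os.walk(root):
            rel = os.path.relpath(current, root)
            if rel != "." and not dirs and not files:
                # A name ending in "/" is how a zip archive records a directory.
                zf.writestr(zipfile.ZipInfo(rel.replace(os.sep, "/") + "/"), "")
            for name in files:
                full = os.path.join(current, name)
                zf.write(full, os.path.relpath(full, root))


# Build a small tree with one file and one empty directory, then zip it.
with tempfile.TemporaryDirectory() as work:
    root = os.path.join(work, "project")
    os.makedirs(os.path.join(root, "empty"))
    with open(os.path.join(root, "readme.txt"), "w") as handle:
        handle.write("hello")
    archive = os.path.join(work, "project.zip")
    zip_tree(root, archive)
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())
```

After this runs, names lists both the file and the "empty/" entry, which is the entry that zipfile would otherwise never create for a directory with nothing in it.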

12.03.05

Mars Attacks is better than War of the Worlds

Posted in Movies, brief at 11:20 am by Frank

War of the Worlds is a shocking movie from this year. It stars the famous Tom Cruise, and its budget was very high. It was very successful. The father did all he could to protect his daughter, which impressed me very much. The computer-generated war machines were also very lifelike. The movie Mars Attacks, which was released in 1996, is 9 years old now. Computer technology at that time was not as mature as it is today, but this movie was also very successful. It’s a comedy, and it’s very funny. The movie tells us that Martians are not friendly. They come to Earth and kill people everywhere. They just want to destroy the world. They have advanced technology, and even nuclear weapons can’t hurt their saucers. They kill people, and people fight back. Many heroes appear in the war, and the Martians’ weakness is finally found. It’s music that kills the enemies in the end, which sounds droll. But this movie is a comedy, so every unbelievable situation is acceptable. Everything is possible in the real world, so who can deny the possibility that music can kill creatures from outer space? By the way, the president in this movie is stupid. I’m very glad he was eventually killed.

12.02.05

pyut is not so bad

Posted in Python, brief at 3:30 pm by Frank

Some days ago, when I searched for “python uml” on Google, I came across pyut. I didn’t try it immediately because its newest release is 2 years old, and I don’t like to use tools that are no longer active. But after failing to find any newer tool for days, I decided to have a look at it. When I downloaded and ran it, I found that it’s not as bad as I had thought. In fact, it’s good enough for me. I imported a .py file and it generated the UML class diagram for me. Here is the screenshot:
pyut

12.01.05

IronPython demo video on MSDN

Posted in C Sharp, Python, brief at 3:00 am by Frank

The video has been on the MSDN site for 20 days, but I only happened to see it today. Jim Hugunin, the author of Jython, is now working at Microsoft on the IronPython project; I heard about it a few months ago. The demo shows how to use CLR classes from inside IronPython and then gives a few UI examples. There’s also the good news that the full debugging features of Visual Studio are available for IronPython. I hope VS will offer good support for editing Python, which badly lacks a powerful IDE like Visual Studio. I think this will change completely when IronPython 1.0 is released, because IronPython will offer nearly full support for CPython, so I will be able to write CPython scripts in Visual Studio. I really can’t wait.

11.30.05

Python thread puzzled me

Posted in Python, brief at 5:33 am by Frank

I wanted to write a net-spider-like thing. Since Python has some convenient HTML-parsing tools, I decided to try it in Python. I found a powerful HTML-parsing tool named Beautiful Soup, which really makes it much easier to deal with real-life HTML. After reading its documentation, I tried it and extracted the links and content without any problems. But I finally got stuck on the Python Thread class: my program died after running for some time. I don’t know how to track down the problem, because I am not familiar enough with Python’s thread APIs. After a few more tries, I think I should leave it as it is and turn to C# instead.
