Category Archives: Semantic Apps

2008’s Most Popular Web 2.0 Sites

Today we are living in Web 2.0 times more than ever before. PR, press coverage, buzz, evangelism, lobbying, who knows whom, who blogs about whom, who talks about whom, mainstream media and beyond – all of those words can be found in the dictionary of almost every new web site that coins itself Web 2.0. But with the global economic crisis rising upon us, promising a deeply depressed business environment with little to no liquidity events for the next few years, the real question is: who are the real winners in today's Web 2.0 space, judged by the real people who have been using their web properties since 2005, when the web 2.0 term was coined for the first time? Since then we have witnessed hundreds of millions of US dollars poured into different Web 2.0 sites, applications and technologies, and perhaps now is the time to find out which of those web sites have worked things out.

We took the time necessary to discover today's most popular Web 2.0 sites based on real traffic and site usage, not on buzz or size of funding. Sites are ranked by their estimated traffic figures. After spending years assessing Web 2.0 sites against tens of different criteria – from economic and technological to media – we came to the conclusion that there is only one criterion worth our attention: the real people who use a given site, the traffic, the site usage, and so on, on the basis of which a web site can successfully be monetized. Of course, there are a few exceptions to the general rule, such as sites with extremely valuable technologies and no traffic at all, but as we said, they are exceptions.

Ad networks, web networks, hosted networks and groups of sites that use consolidated traffic numbers as their own, or that rely on the traffic of other sites to boost their own figures (various ad networks, Quantcast, WordPress, etc.), are not taken into consideration as a whole; the sites within those respective networks and groups have been ranked separately. International traffic is, of course, taken into consideration. Usage of add-ons, social network apps and widgets is not. Sub-domains, as well as international TLDs that are part of the principal business of the main domain/web site, are included. Media sites, including those covering the Web 2.0 space, have also been included. Old-timers from the dot-com era are not considered and are excluded from the ranking.

Disclaimer: some of the data on which the sites below are ranked may be incomplete or incorrect due to the lack of publicly available traffic data for the respective sites. Please also note that the data used for the ranking may have changed in the meantime and may no longer be current by the time you read the list. Data was gathered during the months of July, August, September and December 2008.

Below are today's most popular Web 2.0 sites, ranked by the traffic they received as measured during the months of July, August and September 2008.

Priority is given to direct traffic-measurement methods wherever applicable. Panel data as well as toolbar traffic figures are not taken into consideration. Traffic details are taken from Quantcast, Google Analytics*, Nielsen Site Audit, Nielsen NetRatings, comScore Media Metrix, internal server log files*, Compete and Alexa. Press-release, public-relations and buzz traffic and usage figures, as they have appeared in the mainstream and specialized media, are given lower priority unless supported by direct traffic-measurement methods.

*wherever applicable

Web Property / Unique visitors per month

  1. WordPress.com ~ 100M
  2. YouTube.com ~ 73M
  3. MySpace.com ~ 72M
  4. Wikipedia.org ~ 69M
  5. Hi5.com ~ 54M
  6. Facebook.com ~ 43M
  7. BlogSpot.com ~ 43M
  8. PhotoBucket.com ~ 34M
  9. MetaCafe.com ~ 30M
  10. Blogger.com ~ 27M
  11. Flickr.com ~ 23M
  12. Scribd.com ~ 23M
  13. Digg.com ~ 21M
  14. Typepad.com ~ 17M
  15. Imeem.com ~ 17M
  16. Snap.com ~ 15.7M
  17. Fotolog.com ~ 15.6M
  18. RockYou.com ~ 15M
  19. Veoh.com ~ 12M
  20. Wikihow.com ~ 12M
  21. Topix.com ~ 11.5M
  22. Blinkx.com ~ 11M
  23. HuffingtonPost.com ~ 11M
  24. Wikia.com ~ 10.8M
  25. Technorati.com ~ 10.6M
  26. Zimbio.com ~ 10.3M
  27. SpyFu.com ~ 10.1M
  28. Heavy.com ~ 9.3M
  29. Yelp.com ~ 8.9M
  30. Slide.com ~ 8.5M
  31. SimplyHired.com ~ 8.5M
  32. Squidoo.com ~ 8.1M
  33. LinkedIn.com ~ 7.5M
  34. HubPages.com ~ 7.2M
  35. Hulu.com ~ 7.1M
  36. AssociatedContent.com ~ 7M
  37. Indeed.com ~ 5.4M
  38. LiveJournal.com ~ 5.2M
  39. Bebo.com ~ 5.1M
  40. Habbo.com ~ 4.9M
  41. Fixya.com ~ 4.5M
  42. RapidShare.com ~ 4.5M
  43. AnswerBag.com ~ 4.4M
  44. Metafilter.com ~ 4.3M
  45. Crackle (Grouper) ~ 4M
  46. Ning.com ~ 3.8M
  47. Breitbart.com ~ 3.8M
  48. BookingBuddy.com ~ 3.7M
  49. Kayak.com ~ 3.6M
  50. Blurtit.com ~ 3.2M
  51. Kaboodle.com ~ 3M
  52. Meebo.com ~ 2.9M
  53. WowWiki.com ~ 2.8M
  54. Friendster.com ~ 2.7M
  55. Truveo.com ~ 2.7M
  56. Trulia.com ~ 2.7M
  57. Twitter.com ~ 2.5M
  58. BoingBoing.net ~ 2.4M
  59. Techcrunch.com ~ 2.2M
  60. Zillow.com ~ 2.2M
  61. MyNewPlace.com ~ 2.2M
  62. Mahalo.com ~ 2.1M
  63. Vox.com ~ 2M
  64. Last.fm ~ 2M
  65. Glam.com ~ 1.9M
  66. Multiply.com ~ 1.9M
  67. Popsugar.com ~ 1.6M
  68. Addthis.com ~ 1.5M
  69. Pandora.com ~ 1.4M
  70. Brightcove.com ~ 1.4M
  71. LinkedWords.com ~ 1.3M
  72. Devshed.com ~ 1.3M
  73. AppleInsider.com ~ 1.3M
  74. Newsvine.com ~ 1.3M
  75. Fark.com ~ 1.2M
  76. BleacherReport.com ~ 1.2M
  77. Mashable.com ~ 1.2M
  78. Zwinky.com ~ 1.2M
  79. Quantcast.com ~ 1.2M
  80. StumbleUpon.com ~ 1.1M
  81. SecondLife.com ~ 1.1M
  82. Magnify.net ~ 1.1M
  83. Uncyclopedia.org ~ 1M
  84. Weblo.com ~ 1M
  85. Del.icio.us ~ 1M
  86. Reddit.com < 1M
  87. Pbwiki.com < 1M
  88. AggregateKnowledge.com < 1M
  89. Eventful.com < 1M
  90. Dizzler.com < 1M
  91. Synthasite.com < 1M
  92. Vimeo.com < 1M
  93. Zibb.com < 1M

Web 2.0 sites getting fewer than 1M unique visitors per month, even though popular in one way or another, are not the subject of this list and were not taken into consideration. We know of at least 100 other really good Web 2.0 sites, apps and technologies of today, but since they get fewer than 1M uniques per month they were not able to make our list. However, sites that are almost there (850K–950K/mo) and believed to be in a position to reach the 1M monthly mark in the next months are included at the bottom of the list. Those sites are marked with "<", which means close to 1M, but not yet there. No hard feelings :).

If we've omitted a site that you know is getting at least 1M uniques per month and you don't see it above, drop us a note at info[at]web2innovations.com and we'll have it included. Please note that a proposed site should have had steady traffic for at least 3 months prior to submission to the list above. Sites like Powerset and Cuil, for example, may not qualify for inclusion due to temporary traffic leaps caused by the buzz they have gotten – a factor we try to offset. For other corrections and omissions, please write to the same email address. Requests for corrections of the traffic figures the sites are ranked on can only be justified by providing us with accurate traffic numbers from reliable direct-measurement sources (Quantcast ("Quantified" publishers), Google Analytics, Nielsen Site Audit, Nielsen NetRatings, comScore Media Metrix, internal server log files, or other third-party traffic-measurement services that use the direct method; no panel data, Alexa or Compete figures will be taken into consideration).

* Note that the ranks given to sites at w2i reflect only our own vision and understanding of the site usage, traffic and unique visitors of the sites being ranked, and do not necessarily reflect the opinions of other industry experts, professionals, journalists and bloggers. You acknowledge that any ranking available on web2innovations.com (the Site) is for informational purposes only and should not be construed as investment advice or a recommendation that you, or anyone you advise, should buy, acquire or invest in any of the companies being analyzed and ranked on the Site, or undertake any investment strategy based on rankings seen on the Site. Moreover, if a company is described or mentioned on our Site, you acknowledge that such description or mention does not constitute a recommendation by web2innovations.com that you engage with or otherwise use that web site.

The full list

LinkedWords.com – the consolidated traffic for the entire 2008 is expected to be in the 10 Million range

Launched in mid-2006, LinkedWords has proven over the past years to be a very effective vehicle for helping web sites get contextually linked at the content-area level, so that Internet users and smart robots discover their information in context. Since launch the contextual platform has grown rapidly, from 30,000 uniques per month in its early days in 2006 to over 1 million unique visitors per month in the past months of 2008.

The successful formula seems simple yet very effective: the more content areas of small-to-mid-level sites are contextually linked into LW's platform, the more contextually targeted unique visitors are shared among the web sites linked in.

Both Google Analytics and Quantcast-measured traffic now report over 1,000,000 unique visitors per month.

Some interesting facts in regard to the site’s traffic and usage to note are:

1) The mark of 400,000 unique visitors per month was surpassed for the first time in April 2007;

2) LW closed the whole of 2007 with more than 4,500,000 unique visitors to its contextual platform;

3) For the 12 months between April 2007 and April 2008, LW recorded more than 7,700,000 unique visitors to its contextual platform;

4) The highest number of monthly visitors so far was recorded in April 2008, when the platform had more than 1,300,000 uniques;

5) 47,564 is the highest number of daily unique visitors recorded so far, which occurred on April 7, 2008.

This year (2008) was, however, not all glorious for LinkedWords. During April '08 the platform experienced unprecedented traffic growth, reaching over 1.3M unique visitors, which resulted in the failure of one of the servers in their cluster and caused major downtime. The affected period ran from Friday, April 25 to Friday, May 16, 2008. Millions of unique visitors to LW were said to have been affected. It took them more than 4 months to completely recover both the platform and its reach.

Despite the major downtime during May '08, which considerably slowed traffic over the several months from May through August '08, the anticipated consolidated traffic for the whole of 2008 is still expected to be in the 10M range – double the 2007 figure.

About LinkedWords

LinkedWords (LW) is an innovative contextual platform built upon millions of English words and phrases organized into contextual categories, paths, and semantic URLs whose mission is to maximize contextual linking among web sites across the Web.

 

Via EPR Network

Via LinkedWords’ Blog

What is the real reason Automattic bought Gravatar?

As some of you already know, w2i (web2innovations.com) keeps an internal archive of almost all funding and acquisition deals that have happened on the web over the past years. While we have the ambition to report on all of them, the deals are so numerous that we end up writing about only the most interesting ones. Such is the case with Automattic buying Gravatar some months ago. We kept the news in our archive for quite a long time, trying to figure out for ourselves the real motive behind the acquisition of Gravatar, and since we could come up with no particular synergy or reason, we have decided today to simply write about it.

First off, Automattic is the company behind the popular blog software WordPress. The site is among the most popular on the web, with more than 90M uniques per month. When Matt Mullenweg announced the deal on Gravatar's blog, he wrote about the many improvements Gravatar would see under its new owner. Scaling things up, for a start: they transferred the Rails application and most of the avatar serving to WordPress.com's infrastructure and servers. Avatar serving was said to be already more than three times as fast, and to work every time. They also moved Gravatar's blog from Mephisto to WordPress, of course.

He further said: "Basically, we did the bare minimum required to stabilize and accelerate the Gravatar service, focusing a lot on making the gravatars highly available and fast. However our plans are much bigger than that." Among those plans: all of the premium features have gone free, with refunds offered to anyone who bought them in the last 60 days; gravatar serving has moved to a Content Delivery Network (CDN), so it will not only be fast but low-latency, and won't slow down a page load; the million avatars WordPress had will be merged with the 115,000 or so Gravatar brought to the table and made available through the Gravatar API; templates will be integrated and improved, and features like multiple avatars brought over from WordPress.com; the bigger sizes (128px) will be brought over and made available for any gravatar (gravatars were previously available only up to 80px); microformat support for things like XFN rel="me" and hCard will be added to all avatar profile pages (a particularly interesting move); a new API will be developed, with cleaner URLs, that allows gravatars to be addressed by things like URL in addition to (or instead of) email addresses; and, not least, the entire application will be rewritten to fit directly into WordPress.com's grid, for internet-scale performance and reliability.
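
To make the avatar-serving talk concrete: a site embeds a gravatar by addressing it with an MD5 hash of its owner's e-mail address, which is what lets any publisher show avatars without asking users to log in. A minimal Python sketch of building such a URL (the size and default-image query parameters are documented in Gravatar's public API; the particular default values chosen here are our own):

import hashlib
from urllib.parse import urlencode

def gravatar_url(email, size=80, default="identicon"):
    """Build a Gravatar image URL.

    Gravatar addresses an avatar by the MD5 hex digest of the
    trimmed, lower-cased e-mail address of its owner.
    """
    digest = hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()
    return "http://www.gravatar.com/avatar/%s?%s" % (
        digest, urlencode({"s": size, "d": default}))

# Requesting the bigger 128px size Matt plans to bring over:
print(gravatar_url("user@example.com", size=128))

Hashing rather than exposing the raw e-mail address is also what keeps the scheme reasonably privacy-preserving while still working across millions of unrelated sites.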

These days, Yahoo's announcement of big plans to move toward web semantics, adopting some of the microformats and hinting to LinkedIn that a better relationship with Yahoo's data set might follow if they adopted them too, is a clear signal that the web is slowly moving toward semantic linking of data. Automattic is obviously looking forward to that time as well, with its plans to add microformats like XFN (XHTML Friends Network) and hCard (a simple, open, distributed format for representing people, companies, organizations and places, using a 1:1 representation of vCard (RFC 2426) properties and values in semantic HTML or XHTML). An interesting example of contextually and semantically linked web data is LinkedWords and, as you can see, the way we use it to semantically and contextually link words across our texts and connect them to its contextual platform.
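
Since XFN and hCard are just conventions layered on ordinary HTML, a tiny sketch shows what such markup on an avatar profile page could look like. This is our own illustration of the published microformat class names, not Automattic's actual markup:

def profile_markup(name, org, url, other_profiles):
    """Render a minimal hCard plus XFN rel="me" links.

    hCard marks the person up with the vcard/fn/org/url class names;
    XFN's rel="me" asserts that the linked pages belong to the same person.
    """
    links = "\n".join('  <a href="%s" rel="me">%s</a>' % (p, p)
                      for p in other_profiles)
    return ('<div class="vcard">\n'
            '  <a class="fn url" href="%s">%s</a>\n'
            '  <span class="org">%s</span>\n'
            '%s\n'
            '</div>') % (url, name, org, links)

print(profile_markup("Jane Doe", "Example Co.",
                     "http://janedoe.example.com",
                     ["http://janedoe.example.com/blog"]))

A crawler that understands the conventions can then stitch those rel="me" links into a single identity – exactly the kind of semantically linked data the paragraph above is about.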

So far so good, but nothing from the above explains why Automattic bought the site called Gravatar. It is definitely not because of the user base (only 115K), and obviously not because of the technology either. Employment through acquisition? Not really: Tom Preston-Werner, the founder of Gravatar, is said to be a big Ruby guy, and taking into consideration the fact that Matt seems to be moving Gravatar toward PHP, it seems highly unlikely that Tom will stay with Automattic.

From everything said publicly, it turns out that Automattic decided to help the small site work better, but no clear benefit for their own company is visible in this deal – or at least not to us.

We do believe Matt when he says "our plans are much bigger than that," but what are those plans? Building a social network upon the avatars and the profile data associated with them, or perhaps an online identity service built on top? Or perhaps simply building a global avatar service (with in-depth profiles) makes more sense for a company that commands over 100M uniques per month than for a tiny web site like Gravatar.

Whatever the case, congratulations to all involved. Terms of the deal were not publicly disclosed.

More about Gravatar

The web is no longer about anonymous content generated by faceless corporations. It is about real people and the real content that they provide.

It is about you.

But as powerful as the web has become, it still lacks the personal touch that comes from a handshake. The vast majority of content you come across on the web will still be near-anonymous even though it may have a name attached. Without knowing the author behind the words, the words cannot be trusted. This is where Gravatar comes in.

Gravatar aims to put a face behind the name. This is the beginning of trust. In the future, Gravatar will be a way to establish trust between producers and consumers on the internet. It will be the next best thing to meeting in person.

Today, an avatar. Tomorrow, Your Identity–Online.

More

http://gravatar.com/
http://site.gravatar.com/site/about
http://automattic.com/
http://blog.gravatar.com/2007/10/18/automattic-gravatar/
http://www.readwriteweb.com/archives/automattic_acquires_gravatar.php
http://www.quantcast.com/p-18-mFEk4J448M
http://microformats.org/wiki/Main_Page
http://rubyisawesome.com/

Some of the Silicon Valley’s top non-Web innovations VCs spent money on

Forbes has assembled a very interesting list of some of Silicon Valley's coolest innovations beyond the web start-ups. What is stated as fact is that venture capitalists poured over $30B into more than 2,500 new ventures in 2007 alone. Some of them had to be non-traditional, the magazine says, and it outlines some of those non-web start-ups. The criteria for making the list: companies with unusual technologies or in surprising niches that recently received additional rounds of venture financing, ranging from gadgets only the military could love to ones that could wind up in your neighbor's car.

Insitu

Insitu is a leading high-tech autonomous systems company. It currently produces and sells an ever-growing fleet of Unmanned Aircraft Systems (UAS) that are low-cost, long-endurance and have low personnel requirements. These UAS provide no-runway launch, unprecedented stabilized day and night video for ISR, robotic flight control, and no-nets capture. Insitu began by creating long-endurance unmanned aircraft to measure atmospheric conditions and perform reconnaissance in remote areas for meteorology, daily weather prediction and climate modeling. Aerosonde, the first aircraft developed by Insitu, is noted for completing the first autonomous crossing of the Atlantic Ocean in 1998. From the Aerosonde, Insitu went on to develop its Insight UAS platform, which is still regularly upgraded and deployed today. In 2001, Insitu began working with Boeing to develop ScanEagle, an ISR-focused Unmanned Aircraft System currently used by the US Navy, the US Marines and the Australian Army.

Insitu closed its Series D round of financing led by Battery Ventures’ Roger Lee in December 2007. The company has plans to release a new autonomous aircraft in 2008.

Incesoft

Founded in 2001, Incesoft Technology Co., Ltd. is the world's leading provider of web robot technology and intelligent interactive information platforms. Incesoft is committed to long-term web robot development and research, providing various information and services for users while giving them a better interactive experience. At present, Incesoft has made great achievements in the field of Chinese artificial-intelligence analysis and information-management services, and it currently runs the largest Chinese-language web robot platform (www.xiaoi.com). The robots can be used on IM, web and mobile platforms, providing information, entertainment, e-commerce and other services for work and daily life. Incesoft also provides customer-service robots for companies and government departments.

To date, Incesoft has more than 20 million users.

With many years of robot-development experience and strong technological capabilities, Incesoft became Microsoft's global strategic partner in February 2006, and the Incesoft Bot Platform became the official robot access platform for Windows Live Messenger. In addition, Incesoft is a strategic partner of Tencent QQ (a popular IM tool in China) and Yahoo Messenger.

Draper Fisher Jurvetson and ePlanet Ventures were among the backers who pledged financing in March 2007.

A4Vision

California-based A4Vision has developed a 3D facial imaging and recognition system that works in conjunction with its established fingerprint identification and verification technology. Clients include high-security outfits such as the U.S. Department of Defense and a Swiss bank. Bioscrypt, a company specializing in access control, acquired A4Vision in March 2007. Investors, including Menlo Ventures and In-Q-Tel, the venture wing of the Central Intelligence Agency, must feel secure.

Ophthonix

Ophthonix, Inc., a San Diego based vision correction company, is changing forever the way we see the world. Customized iZon® High Resolution Lenses allow wearers to see the world in High-Definition—clearer, sharper and more vividly than ever before. The proprietary and patented process is the first ever vision correction technology that addresses the problems associated with the unique variations in each person’s eyes, allowing for customized eyeglass lenses.

The measurement yields a detailed picture, much like a fingerprint of your eye; the iZon lens, custom-built to help reduce glare in nighttime driving, is the result. Kleiner Perkins Caufield & Byers was among the investors who put $35.1 million into Ophthonix's December 2006 Series D round.

Dash Navigation

Dash Navigation has developed the Dash Express, which is an Internet-connected GPS device that offers route choices based on traffic information generated from other Dash Express devices and the Internet.

Superior traffic with the Dash Driver Network™: Select your route based on up-to-the-minute traffic data that is automatically and anonymously exchanged via the most reliable source – other Dash devices. The Dash Express gathers traffic information from the Dash Driver Network and combines it with other sources of traffic data to provide you with the most accurate picture of what's happening on the routes you're travelling. And only Dash provides traffic information for freeways as well as local roads and side streets. Dash Express provides up to three routing options to your destination that are based on flow rather than incident data, and even has the ability to automatically alert you when traffic conditions change and a faster route is available.

Find virtually anything with Yahoo!® Local search: Connect to Yahoo! Local search to find unlimited points of interest – people, places, products and services – based on your specific needs.

Two-way connectivity gives Dash Express the ability to use Yahoo! Local search and other internet search sources to find almost anything anywhere. Unlike other GPS devices that come loaded with a static database of points of interest, Dash gives you access to unlimited points of interest based on your specific needs.

Send2Car™ means no typing required: It's the fastest and easiest way to send an address straight to your device from any computer. Just highlight an address in your Internet browser or Microsoft Outlook and send it directly to the car. You can use Send2Car yourself or, when you're on the road, have someone else do it for you.

MyDash makes it even easier to personalize your Dash Express: MyDash, available at my.dash.net, allows you to create and send customized search buttons straight to your device so you always have access to the places you want to go. And you can even take advantage of local knowledge from the Dash network by downloading location lists shared online by other users.

AutoUpdate™ means a GPS that's always up to date: Dash Express is the only GPS that automatically and wirelessly updates its software and traffic data using two-way connectivity. You'll always have the latest and greatest features as they are released. With Dash you are always up to date!

The company secured $25 million in February 2007 from investors, including Sequoia Capital and Kleiner Perkins Caufield & Byers.

3DV Systems

3DV Systems is a pioneer and world leader in the three-dimensional video imaging industry. Established in 1997 and headquartered in Yokne’am, Israel, the company has developed a unique proprietary technology which enables video cameras to capture the depth dimension of objects in real time, high speed and very high resolution.

This patented technology works in real time, at high speed and very high resolution, while using little or no CPU resources. 3DV markets, in a fab-less OEM model, a chipset that can be integrated to create systems and solutions for multiple applications, as well as the ZCam™ (previously Z-Sense) family of 3D cameras.

3DV was founded by Dr. Giora Yahav and Dr. Gabi Iddan, two veteran scientists from Rafael, Israel's leading defense contractor. Leveraging the experience and know-how they gained leading the development of electro-optical missile technology, they came up with the ground-breaking concept of measuring distance to objects using the Time-of-Flight principle.
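
The Time-of-Flight principle itself is a single line of arithmetic: emit light, time the reflection, and halve the round trip. A quick illustrative calculation in Python (the sample timing value is ours, chosen purely for scale):

C = 299_792_458.0  # speed of light, in meters per second

def tof_distance(round_trip_seconds):
    """Time-of-Flight: distance = (speed of light * round-trip time) / 2."""
    return C * round_trip_seconds / 2.0

# A reflection arriving ~13 nanoseconds after the pulse leaves the camera
# puts the object roughly two meters away, which is why ToF depth sensing
# needs extremely fast timing electronics.
print(tof_distance(13.3e-9))  # ~1.99 (meters)

Measuring round trips at that timescale, per pixel, is the hard engineering problem the chipset described above is built to solve.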

Since the successful completion in 2000 of its first 3D camera aimed at the broadcast studio market, the ZCam™ (previously Z-Sense), 3DV has dramatically reduced the size and cost of its technology, widening the scope of markets and applications and now reaching consumer markets. The company's latest prototype camera is the size of a typical webcam and provides home users with revolutionary gesture-recognition capabilities in addition to real-time background replacement, enabling them to control video games and their personal space through intuitive body gestures and immerse themselves in virtual reality.

Kids may be excited about a new way to play. Adults, by contrast, may appreciate how the technology can be applied to reality: video cameras in their cars. The cameras can detect signs of fatigue and alert the driver, or help deploy airbags safely based on the exact location of passengers' heads.

Kleiner Perkins Caufield & Byers and Pitango Venture Capital led the $15 million investment round in December 2006.

Hyperactive Technologies 

The company started in the mind of a founder with two simple questions:

“Why is this burger so bad?”
“What can we bring to the table to make this better?”

In answering those questions – and finding a solution for the problem – HyperActive Technologies looked closely at the processes of quick-service restaurants, and has brought a full array of vision, prediction, and task-management technologies to bear in an industry where competition is fierce and quality is the number one differentiator.

HyperActive Bob is the first and only fully-automated Kitchen Management System that’s improving food quality in QSRs across the country. Here are the driving forces behind our technologies:

Vision: advanced real-time vision technologies monitor customer arrivals constantly and without wavering.

Prediction: Powerful processing tools learn from historical and real-time sales, incorporating the results of this analysis into real-time task management.

Action: easy-to-read touch screen monitors tell cooks precisely what to cook, and when to cook it.

The result: HyperActive Technologies provides "sight and insight" managers have never had before, and more:

HyperActive Bob is the Predictive Kitchen Management System that tells cooks what to cook, and when to cook it, assuring that all of your operations perform as smoothly as your best!

Drive-thru Speed of Service Timer is the first of its kind tool to measure the amount of time drive-thru customers spend in line before they reach the order board!

Walk-in Demand Prediction provides Bob’s keen demand prediction for restaurants that may not have vehicle entries.

HyperActive Technologies is based in Pittsburgh and is a privately held company. Last May, the company purchased QTime Solutions, maker of a drive-thru timer, to help speed up how HyperActive develops its recommendations. Private angel investors organized by Spencer Trask Ventures presumably had a quick meeting before deciding to put $8.5 million into the firm in 2006.

Basically, it is becoming clear that not all VC money goes to sites a la Facebook – yet the US economy is not in its best state today to accommodate and absorb some of these great inventions and innovations.

More

http://www.forbes.com/2008/01/24/midas-tech-novel-tech-08midas-cz_ed_0124novel.html
http://www.insitu.com/
http://www.incesoft.com/English/
http://www.xiaoi.com/
http://www.in-q-tel.org/technology-portfolio/a4vision.html
http://www.bioscrypt.com/
http://www.dash.net/
http://www.izonlens.com/about/
http://www.3dvsystems.com/
http://www.3dvsystems.com/gallery/movies/VirtualGame.mpg
http://www.hyperactivetechnologies.com/ 

ETech, the O’Reilly Emerging Technology Conference is coming

One of the most important technology conferences of the year will be held March 3–6 in San Diego, California. ETech, the O'Reilly Emerging Technology Conference, now in its seventh year, will take a wide-eyed look at the brand-new tech that's tweaking how we are seen as individuals, how we choose to channel and divert our energy and attention, and what influences our perspective on the world around us. How does technology help you perceive things you never noticed before? How does it help you be found, or draw attention to issues, objects, ideas and projects that are important, no matter their size or location?

Below is what the 2008 edition of ETech, the O'Reilly Emerging Technology Conference, will look at.

Body Hacking. Genomics Hacking. Brain Hacking. Sex Hacking. Food Hacking. iPhone Hacking.
If you can’t open it, you don’t own it. Take over the everyday aspects of your life and take your senses to the next level.

DIY Aerial Drones. DIY Talking Things. DIY Spectrum. DIY Apocalypse Survival.
As technology becomes more accessible you’ll get to do it all on your own. Self-empowerment starts here.

Emerging Tech of India, Cuba, and Africa. International Political Dissidents.
Different environments incubate new ideas and technologies. What these societies bring out will shake up your cultural assumptions and provide a wider world view.

Visualize Data and Crowds. Ambient Data Streaming.
Dynamic systems require new methods of data capture and interaction. Open a window on the methods experts use to interpret and harness collective intelligence.

Good Policy. Energy Policy. Defense Policy. Genetic Policy. Corruption.
Policy inevitably lags behind technology advances. Learn about some areas where it’s catching up, where it’s not, and how these boundaries shape our creativity and freedom.

Alternate Reality Games. Emotions of Games. Sensor Games.
Games provide a platform for experimentation on so many levels. The ones we’ll see engage their players in new and unexpected ways.

ETech 2008 will cover all of these topics and more. We put on stage the speakers and the ideas that help our attendees prepare for and create the future, whatever it might be. Great speakers are going to pull us forward with them to see what technology can do… and sometimes shouldn’t do. From robotics and gaming to defense and geolocation, we’ll explore promising technologies that are just that–still promises–and renew our sense of wonder at the way technology is influencing and altering our everyday lives.

“There’s more good stuff here, more new directions, than we’ve had at ETech in years, which is only to be expected, as the market starts to digest the innovations of Web 2.0 and we are now featuring the next wave of hacker-led surprises.” Read more of Tim O’Reilly’s thoughts on why ETech is our most important conference.

Registered Speakers

Below are listed all confirmed speakers to date.

Dan Albritton (MegaPhone)
Chris Anderson (Wired Magazine)
W. James Au (The Making of Second Life)
Trevor Baca (Jaduka)
Tucker Balch (Georgia Tech)
Kevin Bankston (Electronic Frontier Foundation)
Andrew Bell (Barbarian Group LLC)
Emily Berger (Electronic Frontier Foundation)
Violet Blue (Violet Blue)
Ed Boyden (MIT Media Lab & Dept. of Biological Engineering)
Gary Bradski (Stanford and Willow Garage)
Tom Carden (Stamen Design)
Liam Casey (PCH International)
Elizabeth Churchill (Yahoo! Research)
Cindy Cohn (Electronic Frontier Foundation)
Steve Cousins (Willow Garage)
Bo Cowgill (Google Economics Group)
Mike Culver (Amazon)
Jason Davis (Disney Online)
Regine Debatty (We Make Money Not Art)
Danielle Deibler (Adobe Systems)
Michael Dory (NYU Interactive Telecommunications Program (ITP))
Nathan Eagle (MIT)
Alvaro Fernandez (SharpBrains.com)
Timothy Ferriss (The 4-hour Workweek)
Eric Freeman (Disney Online)
Limor Fried (Adafruit Industries)
Johannes Grenzfurthner (monochrom, and University of Applied Sciences Graz)
Saul Griffith (Makani Power/Squid Labs)
Karl Haberl (Sun Microsystems, Inc.)
Jury Hahn (MegaPhone)
Justin Hall (GameLayers)
Jeff Han (Perceptive Pixel, Inc.)
Timo Hannay (Nature Publishing Group)
Marc Hedlund (Wesabe)
J. C. Herz (Batchtags LLC)
Todd Holloway (Ingenuity Systems)
Pablos Holman (Komposite)
Tom Igoe (Interactive Telecommunications Program, NYU)
Alex Iskold (AdaptiveBlue)
Brian Jepson (O’Reilly Media, Inc.)
Natalie Jeremijenko (NYU)
Jeff Jonas (IBM)
Tim Jones (Electronic Frontier Foundation)
Terry Jones (Fluidinfo)
Damien Katz (IBM – CouchDB)
Nicole Lazzaro (XEODesign, Inc.)
Elan Lee (Fourth Wall Studios)
Jan Lehnardt (Freisatz)
Lawrence Lessig (Creative Commons)
Kati London (area/code)
Kyle Machulis (Nonpolynomial Labs)
Daniel Marcus (Washington University School of Medicine)
Mikel Maron (Mapufacture)
John McCarthy (Stanford University)
Ryan McManus (Barbarian Group LLC)
Roger Meike (Sun Microsystems, Inc.)
Chris Melissinos (Sun Microsystems, Inc.)
Dan Morrill (Google)
Pauline Ng (J. Craig Venter Institute)
Quinn Norton
Peter Norvig (Google, Inc.)
Nicolas Nova (Media and Design Lab)
Danny O’Brien (Electronic Frontier Foundation)
Tim O’Reilly (O’Reilly Media, Inc.)
David Pescovitz (BoingBoing.net, Institute for the Future, MAKE:)
Bre Pettis (I Make Things)
Arshan Poursohi (Sun Microsystems, Inc.)
Marc Powell (Food Hacking)
Jay Ridgeway (Nextumi)
Hugh Rienhoff (MyDaughtersDNA.org)
Jesse Robbins (O’Reilly Radar)
Eric Rodenbeck (Stamen Design)
David Rose (Ambient Devices)
Dan Saffer (Adaptive Path)
Joel Selanikio (DataDyne.org)
Peter Semmelhack (Bug Labs)
Noah Shachtman (Wired Magazine)
Michael Shiloh (OpenMoko)
Kathy Sierra (Creating Passionate Users)
Micah Sifry (Personal Democracy Forum)
Adam Simon (NYU Interactive Telecommunications Program (ITP))
Michael J. Staggs (FireEye, Inc.)
Gavin Starks (d::gen network )
Alex Steffen (Worldchanging)
John Storm (ind)
Stewart Tansley (Microsoft Research)
Paul Torrens (Arizona State University)
Phillip Torrone (Maker Media)
Kentaro Toyama (Microsoft Research India)
Gina Trapani (Lifehacker)
Nate True (Nate True)
Lew Tucker (Radar Networks)
Andrea Vaccari (Senseable City Lab, MIT)
Scott Varland (NYU Interactive Telecommunications Program (ITP))
Merci Victoria Grace (GameLayers)
Mike Walsh (Tomorrow)
Stan Williams (Hewlett-Packard Labs)
Ethan Zuckerman (Global Voices)

Attendee Registration

You can register as an attendee online or by Mail/Fax at the following address:

O’Reilly Media, Inc.
Attn: ETech Registration
1005 Gravenstein Hwy North
Sebastopol, CA 95472
Fax: (707) 829-1342

The conference fees are as follows (valid Jan 29 through Mar 2):
Sessions plus Tutorials $1,690.00
Sessions Only $1,390.00
Tutorials Day Only $595.00

Walk-ins: standard registration closes March 2, 2008. Onsite registration costs an additional $100 on top of the standard prices above.

More about ETech

Now in its seventh year, the O'Reilly Emerging Technology Conference homes in on the ideas, projects, and technologies that the alpha geeks are thinking about, hacking on, and inventing right now, creating a space for all participants to connect and be inspired. ETechs past have covered everything from peer-to-peer networks to person-to-person mobile messaging, web services to weblogs, big-screen digital media to small-screen mobile gaming, hardware hacking to content remixing. We've hacked, blogged, ripped, remixed, tracked back, and tagged to the nth. Expect much of what you see in early form here to show up in the products and services you're taking for granted in the not-too-distant future.

ETech balances blue-sky theorizing with practical, real-world information and conversation. Tutorials and breakout sessions will help you inject inspiration into your own projects, while keynotes and hallway conversation will spark enough unconventional thinking to change how you see your world.

More than 1,200 technology enthusiasts are expected to attend ETech 2008, including:

  • Technologists
  • CxOs and IT managers
  • Hackers and grassroots developers
  • Researchers and academics
  • Thought leaders
  • Business managers and strategists
  • Artists and fringe technologists
  • Entrepreneurs
  • Business developers and venture capitalists
  • Representatives from companies and organizations tracking emerging technologies

In the past, ETech has brought together people from such diverse companies, organizations, and projects as: 37signals, Adaptive Path, Amazon.com, Attensa, August Capital, BBC, Boeing, CBS.com, Comcast, Department of Defense, Disney, E*Trade, Fairfax County Library, Fidelity Investments, Fotango, France Telecom, General Motors, Honda, IEEE, Intel, Macromedia, Meetup, Microsoft, Morgan Stanley, Mozilla, National Security Agency, New Statesman, Nielsen Media Research, Nokia, NYU, Oracle, Orbitz, Platial, Salesforce.com, Sony, Starwood Hotels, Symantec, The Motley Fool, UC Santa Barbara Kavli Institute, Zend, and many more.

Some of ETech’s past sponsors and exhibitors include: Adobe, Aggregate Knowledge, Apple, AT&T, Attensa, eBay, Foldera, Google, IBM, Intuit, iNetWord, Laszlo, MapQuest, mFoundry, Root, RSSBus, Salesforce.com, Sxip, TechSmith, Tibco, Windows Live, Yahoo!, and Zimbra.

The conference is expected to gather some of the brightest minds of today's technology world, and of the Web in particular.
More

http://conferences.oreilly.com/etech/
http://en.oreilly.com/et2008/public/content/home
http://radar.oreilly.com/archives/2008/01/why-etech-is-oreillys-most-imp.html
 

Inform receives $15 Million investment from Spark Capital

Inform Technologies, which provides a technology solution for established media brands, has received a $15 million investment from Spark Capital, a Boston-based venture fund focused on the intersection of the media, entertainment and technology industries.

The company said in its press release that it will use the funds to accelerate growth. The company also claims that nearly 100 media brands use Inform's journalistic technology to enhance their sites.

Founded in 2004, Inform currently works with nearly 100 major media brands, helping them ensure that their sites are content destinations, and offers editorial-quality features that keep readers engaged on their sites longer – and that increase page views and revenue potential.

Inform’s key offering is a technology solution that acts as an extra editor. It starts with a page of text, and then, with editorial precision, it automatically creates and organizes links to relevant content from the media property’s site, its archives, from affiliate sites and/or anywhere else on the Web. As a result, each page on a site becomes a richer multimedia experience.

Said James Satloff, CEO of Inform, “Media companies face significant challenges online. They need to attract new unique visitors, create an experience that compels those readers to spend more time consuming more pages, and then turn those page views and time on site into revenue. We believe that the Inform solution enables them to do exactly that.”

Longstanding Inform clients include Conde Nast, Crain Communications, IDG, The New York Sun and Washingtonpost.Newsweek Interactive. In recent months, 30 additional media properties have engaged Inform – many of them already running Inform's technology on their sites.

Inform uses artificial intelligence and proprietary rules and algorithms to scan millions of pages of text and read the way a journalist does – identifying key “entities,” such as people, places, companies and products, and recognizing how they connect, even in subtle and context-specific ways. The software continually teaches itself – in real time – how information is related and automatically updates links and topics as the context changes.
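
Inform has not published its algorithms, so purely to illustrate the basic mechanics described above – spotting known entities in text and turning them into links to topic pages – here is a deliberately naive, dictionary-based Python sketch; the entity catalog and the URL scheme are invented for the example:

import re

# A toy entity catalog; a real system resolves entities with NLP and
# disambiguation rather than a hand-written dictionary.
ENTITIES = {
    "Conde Nast": "/topic/conde-nast",
    "IDG": "/topic/idg",
    "New York": "/topic/new-york",
}

def link_entities(text):
    """Wrap known entity mentions in links to their topic pages."""
    # Longest names first, so multi-word entities win over their substrings.
    for name in sorted(ENTITIES, key=len, reverse=True):
        text = re.sub(r"\b%s\b" % re.escape(name),
                      '<a href="%s">%s</a>' % (ENTITIES[name], name),
                      text)
    return text

print(link_entities("Conde Nast and IDG both publish from New York."))

What separates a toy like this from Inform's "extra editor" is precisely the part the paragraph emphasizes: recognizing how entities connect, even in subtle and context-specific ways, and re-learning those connections in real time.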

Santo Politi, Founder and Partner at Spark Capital, commented: "Established media brands need cost-effective ways to compete with each other and, importantly, with other online presences, such as search. They need depth and richness in their content so they're true destinations and so readers spend more time on the sites and click through more pages. Inform provides a truly elegant – and so far very successful – solution for that. While allowing the publication to remain in full control of its content and editorial integrity, Inform automatically enriches a site by enabling it to leverage its own content, its archives, archives of affiliates and the web overall. In effect, it enables a publication to expand its editorial capabilities without expanding its staff. We believe the potential for Inform's growth is substantial."

"We're delighted that our new investor understands how effectively we partner with media companies and how our technology serves their business and editorial objectives. We will use the capital to expand our operations and implement our approach to accelerating our growth," said Joseph Einhorn, Co-Founder and CTO of Inform.

We went over the web and researched the company a bit. It turns out the company has shifted its focus quite often over the past several years. In 2005 the company said it was there to provide a useful news interface – both blog and non-blog – and to show the interconnectedness of all of the content. Later the same year, a major re-launch and re-design took place: they gave up on the Ajax-based pop-up and also added video and audio, which hardly fits the concept of contextual connection between two content areas/texts based on semantic textual analysis – unless they have come up with an idea of how to read inside and understand image and video files. Google, by contrast, seems to have come up with technology that claims to recognize text in images. In late 2006 the company brought to market its so-called Inform Publisher Services, aimed at big web publishers and designed to help them increase page views by adding relevant links to other, hopefully related, content in their archives.

The new service was meant to automatically create links in existing articles, pointing to a results page containing relevant content from the site as well as from the web, including blogs and audio/video content. Sounds like Sphere and LinkedWords. Basically, that latest offering comes closest to what Inform.com is today.

Some critics of the service have published the following doubts online, on a few blogs we checked out in regard to Inform:

Isn’t this the opposite of semantic web, since they’re sucking in unstructured data? How does their relatedness stuff compare to Sphere and how do their topic pages compare to Topix?

Marshall Kirkpatrick from RWW put it this way when the question of standards and openness was raised:

“Inform crunches straight text and outputs HTML. I asked whether they publish content with any standards based semantic markup and they said that actual publishing is up to publishers. That’s a shame, I don’t see any reason why Inform wouldn’t participate in the larger semantic web to make its publishers’ content more discoverable. Perhaps when you’ve got 100 live clients and now $15m in the bank, it feels like there’s no reason to open up and play nice with a movement of dreamers having trouble getting other apps out of academia.”

Competitors include Sphere, Proximic, Lijit, AdaptiveBlue, LinkedWords and, in some respects, NosyJoe and Jiglu, among others. Other, though more remote, players in this space include Attendi, Diigo, Twine and Freebase.

More about Inform

Inform Technologies is a new technology solution for established media brands that automatically searches, organizes and links content to provide a rich, compelling experience that attracts and retains readers.

With editorial-quality precision, the technology understands textual content and recognizes subtle differences in meaning. Further, the technology automatically creates links – in articles and on instantly generated topic pages – to relevant content. This deepens a site and engages readers.

Inform’s Essential Technology platform is an artificial intelligence and natural language-based solution that serves almost as an “extra editor” using rules and algorithms to “read” millions of pages of content, identify entities, such as people, places, companies, organizations and products, and topics, to create intelligent links to other closely related information. The technology is also able to recognize subtle differences in meaning and distinguish people, places and things based on local geographies or unique identities.

Inform's Connected Content Solution and Essential Technology Platform are used by major media brands including CNN.com, Washingtonpost.Newsweek Interactive, Conde Nast, Meredith, IDG and Crain Communications.

Founded in 2004, the company is privately held and has approximately 60 employees, including mathematicians, linguists, programmers, taxonomists, library scientists and other professionals based in New York and India.

About Spark Capital

Spark Capital is a venture capital fund focused on building businesses that transform the distribution, management and monetization of media and content, with experience in identifying and actively building market-leading companies in sectors including infrastructure (Qtera, RiverDelta, Aether Systems, Broadbus and BigBand), networks (College Sports Television, TVONE and XCOM) and services (Akamai and thePlatform). Spark Capital has over $600 million under management and is based in Boston, Massachusetts. Spark has committed to investing $20 million in CNET equity.

More

http://www.inform.com/ 
http://www.inform.com/pr.012308.html
http://www.readwriteweb.com/archives/inform_funding.php
http://www.micropersuasion.com/2005/10/a_new_rss_reade.html
http://www.paidcontent.org/pc/arch/2005_10_16.shtml#051884
http://www.techcrunch.com/tag/inform.com/
http://blog.express-press-release.com/2007/10/19/a-bunch-of-intelligent-and-smart-content-tagging-engines/
http://www.techcrunch.com/2007/10/19/twine-launches-a-smarter-way-to-organize-your-online-life/
http://blog.nosyjoe.com/2007/09/06/nosyjoecom-is-now-searching-for-tags/
http://nextnetnews.blogspot.com/2007/09/is-nosyjoecom-next-clustycom.html
http://kalsey.com/2007/10/jiglu_tags_that_think/
http://mashable.com/2007/10/15/jiglu/
http://www.nytimes.com/2005/10/17/technology/17ecom.html
http://www.techcrunch.com/2005/10/16/informcom-doesnt/
http://www.techcrunch.com/2005/10/24/a-second-look-at-informcom/
http://www.techcrunch.com/2005/12/05/informcom-re-launches-with-major-feature-changes/
http://business2.blogs.com/business2blog/2006/07/scoop_inform_re.html
http://www.techcrunch.com/2006/07/30/informcoms-latest-offering/
http://www.quantcast.com/inform.com
http://bits.blogs.nytimes.com/2007/07/04/when-search-results-include-more-search-results/

Massive second round of funding for Freebase – $42 Million

Freebase, the open, shared database of the world's knowledge, has raised a whopping $42 million in its Series B round of funding, a round that included Benchmark Capital and Goldman Sachs. Total funding to date is $57 million.

The investment is considerable, and comes at a time when a number of experts are betting that a more powerful, “semantic” Web is about to emerge, where data about information is much more structured than it is today.

In March 2006, Freebase received $15 million in funding from investors including Benchmark Capital, Millennium Technology Ventures and Omidyar Network.

Freebase, created by Metaweb Technologies, is an open database of the world’s information. It’s built by the community and for the community – free for anyone to query, contribute to, build applications on top of, or integrate into their websites.

Already, Freebase covers millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC archives, it contains structured information on many popular topics, including movies, music, people and locations – all reconciled and freely available via an open API. This information is supplemented by the efforts of a passionate global community of users who are working together to add structured information on everything from philosophy to European railway stations to the chemical properties of common food ingredients.

By structuring the world’s data in this manner, the Freebase community is creating a global resource that will one day allow people and machines everywhere to access information far more easily and quickly than they can today.

Freebase aims to "open up the silos of data and the connections between them," according to founder Danny Hillis at the Web 2.0 Summit. Freebase is a database that holds all kinds of data, with an API. Because it's an open database, anyone can enter new data into Freebase. An example page in the Freebase db looks pretty similar to a Wikipedia page. When you enter new data, the app can make suggestions about content. Topics in Freebase are organized by type, and you can connect pages with links and semantic tagging. In summary, Freebase is all about shared data and what you can do with it.

Here's a video tour of how Freebase works. Freebase categorizes knowledge according to thousands of "types" of information, such as film, director or city. Those are the highest order of categorization. Underneath those types you have "topics" – individual examples of the types, such as Annie Hall and Woody Allen. It boasts two million topics to date. This lets Freebase represent information in a structured way and support queries from web developers wanting to build applications around it. It also solicits people to contribute their knowledge to the database, governed by a community of editors. It offers a Creative Commons license so that the data can be used to power applications, via an open API.
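
To give a flavor of those developer queries: Freebase's read service accepts JSON queries written in the Metaweb Query Language (MQL), which works by example – you supply a partial object, and nulls or empty lists mark the slots to be filled in. The sketch below follows Freebase's developer documentation as we understand it, so treat the exact endpoint and envelope format as assumptions of this sketch:

import json
from urllib.parse import urlencode
from urllib.request import urlopen

# MQL is query-by-example: the empty list marks the slot to fill in.
query = {"type": "/film/director",
         "name": "Woody Allen",
         "film": []}  # ask for the list of this director's films

# The public mqlread endpoint, per the developer docs (an assumption here).
url = ("http://api.freebase.com/api/service/mqlread?" +
       urlencode({"query": json.dumps({"query": query})}))
with urlopen(url) as response:
    envelope = json.load(response)
print(envelope["result"]["film"])

The query-by-example style is what makes the "types" above matter: it is the /film/director type that tells Freebase a "film" slot exists to be filled.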

This is one of the biggest Series B rounds of the past 12 months. And probably what Google is trying to do with Knol versus Wikipedia is the same thing Freebase is trying to achieve: replicate and commercialize the huge success of the non-profit Wikipedia.

Other semantic applications and projects include Powerset, Twine, AdaptiveBlue, Hakia, Talis, LinkedWords, NosyJoe, TrueKnowledge, among others.

Peter Rip, an investor in Twine, quickly reacted to the comparison between Freebase and Twine that VentureBeat's Matt Marshall made:

As an investor in Twine, allow me to correct you about Twine and Metaweb's positioning. You correctly point out that Metaweb is building a database about concepts and things on the Web. Twine is not. Twine is really more of an application than a database. It is a way for persons to share information about their interests. So they are complementary, not competitive.

What's most important is that Twine will be able to use all the structure in something like Metaweb (and other content sources) to enrich the user's ability to track and manage information. Think of Metaweb as a content repository and Twine as the app that uses content for specific purposes.

Twine is still in closed beta. So the confusion is understandable, especially with all the hype surrounding the category.

Nova Spivack, the founder of Twine, has also commented:

Freebase and Twine are not competitive. That should be corrected in the above article. In fact our products are very different and have different audiences. Twine is for helping people and groups share knowledge around their interests and activities. It is for managing personal and group knowledge, and ultimately for building smarter communities of interest and smarter teams.

Metaweb, by contrast, is a data source that Twine can use, but is not focused on individuals or on groups. Rather Metaweb is building a single public information database, that is similar to the Wikipedia in some respects. This is a major difference in focus and functionality. To use an analogy, Twine is more like a semantic Facebook, and Metaweb is more like a semantic Wikipedia.

Freebase is in alpha.

Freebase.com was the first Semantic App to be featured by Web2Innovations in its series of planned publications in which we will try to discover, highlight and feature the next generation of web-based semantic applications, engines, platforms, mash-ups, machines, products, services, mixtures, parsers and approaches – and far beyond.

The purpose of these publications is to discover and showcase today’s Semantic Web Apps and projects. We’re not going to rank them, because there is no way to rank these apps at this time – many are still in alpha and private beta.
More

http://www.metaweb.com/about/
http://freebase.com
http://roblog.freebase.com
http://venturebeat.com/2008/01/14/shared-database-metaweb-gets-42m-boost/
http://www.techcrunch.com/2008/01/16/freebase-takes-42-million/
http://www.dmwmedia.com/news/2008/01/15/freebase-developer-metaweb-technologies-gets-$42.4-million
http://www.crunchbase.com/company/freebase
http://www.readwriteweb.com/archives/10_semantic_apps_to_watch.php
http://en.wikipedia.org/wiki/Danny_Hillis
http://www.metaweb.com
http://en.wikipedia.org/wiki/Metaweb_Technologies
https://web2innovations.com/money/2007/11/30/freebase-open-shared-database-of-the-worlds-knowledge/
http://mashable.com/2007/07/17/freebase/
http://squio.nl/blog/2007/04/02/freebase-life-the-universe-and-everything/

Behavioral Targeting is Busted; Marketers are Barking up the Wrong Tree!

Behavioral Targeting (BT) has been around since the first dotcom days. It got really hot again in late 2007 thanks to a few big promoters like Facebook. But what is it, and does it really work as advertised?

BT tracks a web visitor's browser click-streams, typically over the last six visits, to predict what he or she may want in the future, and targets ads, content or products based on those "personalized" past behaviors. The hope is that BT will show the right ad or product to the user who is most susceptible to it. This sounds ideal to advertisers but, put yourself in the shoes of a user, and two huge problems leap out: privacy and quality.
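
In code terms, classic BT boils down to counting categories over a short window of past clicks. A deliberately oversimplified Python sketch of the idea (the six-visit window mirrors the figure above; everything else is our illustration):

from collections import Counter

def bt_ad_category(clickstream, window=6):
    """Pick an ad category the classic BT way: count the visitor's
    recent page categories and serve against the most frequent one,
    regardless of what the visitor is doing right now."""
    recent = clickstream[-window:]
    return Counter(recent).most_common(1)[0][0]

visits = ["autos", "autos", "baby", "autos", "travel", "baby"]
print(bt_ad_category(visits))  # -> 'autos'

Note what the sketch needs as input: a persistent, per-user history. That single design decision is the root of both problems discussed below.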

The Privacy Issue
With such a glut of products and information online, the motivation behind behavioral targeting makes sense – it seems like a good thing for Yahoo to show me a more relevant ad because it happens to know I checked out a Prius at my local dealership. For consumers, however, there is an obvious psychological aversion to behavioral targeting, as they feel they are being personally tracked and watched.

In this age of identity theft and mounting concerns over privacy in general, a practice that proactively profiles a user, perhaps across many websites and over a period of several months, will sound alarms even among the least conservative of us. And while BT advocates will defend their practice of storing only anonymous data – which is the proper thing to do – knowing that your likes, dislikes, shopping history and viewing tendencies are being tracked and possibly shared or sold to advertisers is disconcerting, to say the least.

In addition, with so much information about us on the web, an anonymous individual on one site can quickly become a known, named user on another site once BT starts to compare and contrast user behaviors across multiple sites. So our private information can spread very quickly without us even knowing it.

Not surprisingly, many advocacy groups are very concerned about the issues surrounding this type of targeting. Privacy groups have recently proposed a "Do Not Track" list to limit behavioral profiling techniques, similar to the "Do Not Call" lists that keep pesky telemarketers away.

Privacy concerns seem to be enough to limit the impact of BT. But there is more.

The Bigger Pitfalls of Behavioral Targeting
Beyond privacy concerns, there are accuracy and quality issues with BT that many online marketers and e-commerce managers may not be aware of. Traditional BT struggles precisely because it tries to discern what I want now based on my past behaviors. Consider the impact of focusing on historical interests instead of current intent – if I bought a gag gift for a bachelor party, I certainly do not want to be bombarded by ads for similar “products” that might cause embarrassment or make me the butt of the joke around the office.

Another way to think of this problem is the idea of roles or personalization.  Humans have far too many roles in life – or what personalization systems might call profiles – to possibly predict what a given user wants on that day.  A woman shopping for baby clothes, a tie for her husband, and a gift for her sister may appear schizophrenic because she is acting in three different roles – mother, wife, and sister.   What do you show her next?  Tossing ads at her about strollers is not going to appeal to her now that she’s shopping for a new cocktail dress for herself.

This is the pitfall of profiles.  In a given month, an individual will have thousands of roles. Knowing my past is not necessarily a better way to predict my future. In fact, this phenomenon has been known by psychologists and other scientists for years – humans are animals of context and situations, much less so of our historical profiles or roles.

Let’s look at Facebook’s behavioral targeting practices. Alex Iskold recently wrote a good post on ReadWriteWeb about a little myth regarding how behavioral targeting is going to help Facebook justify its $15 billion valuation. I like Alex’s summation of the myth: “because Facebook knows everything about us, it will always be able to serve perfect ads.” But the reality is very different.

Facebook does not really know much about us, especially anything about our true intent at any given moment when we are on the network.  Their user profiles are historical artifacts and not tied to current intent. In addition, the behaviors that users exhibit on Facebook are about connecting with one another – not about reading, researching, and buying like the rest of the web. And finally, when users connect they’re only acting in one of their infinite roles.

In the end, the ads we get served on Facebook today are the direct result of the lack of understanding of its users.  Those in the ad industry liken these to “Run of Network” ads which are not targeted and are simply designed to get a fraction of a percent click-through.  Unsurprisingly, most ads are about dating.

Enter Intent-based Targeting
An alternative that solves the issues with both privacy and effectiveness is one centered on understanding the user’s intent, instead of their clickpath or profile, and pairing that with specific content, product, and advertising recommendations. This approach relies exclusively on the collective wisdom of like-minded peers who have demonstrated interests or engagement with similar content and context.

The concept of profiles is completely removed in this case; instead, by understanding the user’s expressed or implied intent, the user sees the content that is appropriate to his or her current mindset.

This is the next evolution in user targeting that gets beyond clicks and analytics, and instead rests on a proven foundation of modern social science theory.  The approach is conceptually simple and mimics how we learn and act in everyday life – making choices based on what others who are in the same current mindset as us have done.

Since humans change roles rapidly, intent-based models allow content recommendations, ads, and even search results to change instantly as users act in a new or different role.  Further, because historical actions and profiles are not needed, 100 percent of the new visitors coming to a website can be targeted with precise content before the first click.
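
A minimal sketch of this intent-based flow, assuming engagement is aggregated per intent rather than per user (our own illustration, not Baynote’s actual product):

    from collections import Counter, defaultdict

    # peer_engagement maps a current intent to the content that peers
    # acting on the same intent engaged with; no per-user history is kept.
    peer_engagement = defaultdict(Counter)

    def record_engagement(intent, content_id):
        peer_engagement[intent][content_id] += 1

    def recommend(current_intent, k=3):
        # Even a brand-new visitor can be targeted before the first click,
        # purely from the intent expressed in the current session.
        return [c for c, _ in peer_engagement[current_intent].most_common(k)]

    record_engagement("cocktail dress", "dress-42")
    record_engagement("cocktail dress", "dress-42")
    record_engagement("cocktail dress", "dress-17")
    record_engagement("baby clothes", "stroller-9")
    print(recommend("cocktail dress"))  # -> ['dress-42', 'dress-17']

The design point is that the lookup key is the current intent, so when the same woman switches from baby clothes to a cocktail dress, the recommendations switch with her instantly.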

Win/Win
Website users care about privacy and usability on the web.  Targeting visitors based on their intent, which is validated by the collective wisdom of those before them with the same intent, is a natural way for visitors to interact with your website – it’s the way humans have been programmed to work.  Most importantly it kills two birds with one stone: users get useful, accurate recommendations and ads while still avoiding the whole privacy mess. 

~~~~~~~~~

Jack is a founder and CEO of Baynote, Inc., a provider of Intent-driven Recommendation and Social Search technology for websites. Previously, Jack served as SVP & founding CTO of Interwoven Inc. with responsibilities across engineering, products, marketing, corporate vision and strategy. Prior to Interwoven, he was a founder and CEO of V-max America. Jack also led operating systems and applications development at SGI, Sun Microsystems, Stratus and NASA. He is a frequent major conference speaker and has appeared on television programs in several countries. He is a contributing author in “XML Handbook, the 4th Edition”, “Online! The Book”, “Content Management Bible”, and writes regularly about key technology issues and trends. He can be contacted at jack@baynote.com.

Hakia takes $5M more, totals $16M

In a new round of funding, Hakia, the natural language processing search engine, has raised an additional $5M. The money came from previous investors, among them Noble Grossart Investments, Alexandra Investment Management, Prokom Investments, KVK, and several angel investors. With plans to launch fully some time next year, Hakia has been working on improving its relevancy and adding social features like “Meet Others” to its site. Hakia is known to have raised $11 million in its first round of funding in late 2006 from a panoply of investors scattered across the globe who were attracted by the company’s semantic search technology. As far as we know, the company’s total funding is now $16M.

We think that, among all the alternative search engines and excluding Ask.com and Clusty.com, Hakia seems to be one of the most trafficked, with almost 1M unique visitors when we last checked the site’s publicly available stats. If we were to rank the most popular search engines, we would put them the following way: Google, Yahoo, Ask.com, MSN, Naver, some other regional leaders, Clusty, and perhaps somewhere in there hakia.

On the other hand, according to Quantcast, Hakia is basically not a very popular site and is reaching fewer than 150,000 unique visitors per month. Compete is reporting much better numbers – slightly below 1 million uniques per month. Considering that the search engine is still in its beta stage, these numbers are more than great. However, analyzing the traffic curve on both measuring sites further, it appears that the traffic hakia gets is sort of campaign-based – in other words, generated by advertising, promotion or PR activity – and is not permanent organic traffic due to heavy usage of the site.

In related news a few days ago Google’s head of research Peter Norvig said that we should not expect to see natural-language search at Google anytime soon.

In a Q&A with Technology Review, he says:

We don’t think it’s a big advance to be able to type something as a question as opposed to keywords. Typing “What is the capital of France?” won’t get you better results than typing “capital of France.”

Yet he does acknowledge that there is some value in the technology:

We think (Google) what’s important about natural language is the mapping of words onto the concepts that users are looking for. To give some examples, “New York” is different from “York,” but “Vegas” is the same as “Las Vegas,” and “Jersey” may or may not be the same as “New Jersey.” That’s a natural-language aspect that we’re focusing on. Most of what we do is at the word and phrase level; we’re not concentrating on the sentence. We think it’s important to get the right results rather than change the interface.

In other words, a natural-language approach is useful on the back-end to create better results, but it does not present a better user experience. Most people are too lazy to type more than one or two words into a search box anyway. The folks at both Google and Yahoo know that is true for the majority of searchers. The natural-language search startups are going to find that out the hard way.
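
To illustrate Norvig’s point about mapping words onto concepts, here is a toy sketch; the alias table is our own made-up example, not Google’s data:

    # Fold safe aliases onto a canonical concept at the word/phrase level.
    CONCEPT_ALIASES = {
        "vegas": "las vegas",
    }

    def normalize_terms(query):
        # "vegas" is safely folded into "las vegas"; "york" is deliberately
        # NOT folded into "new york", since they name different concepts.
        return [CONCEPT_ALIASES.get(t, t) for t in query.lower().split()]

    print(normalize_terms("Vegas hotels"))  # -> ['las vegas', 'hotels']
    print(normalize_terms("York hotels"))   # -> ['york', 'hotels']

This is back-end word-and-phrase work, not sentence understanding – exactly the distinction Norvig draws.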

Founded in 2004, hakia is a privately held company with headquarters in downtown Manhattan. hakia operates globally with teams in the United States, Turkey, England, Germany, and Poland.

The founder of hakia is Dr. Berkan, a nuclear scientist with a specialization in artificial intelligence and fuzzy logic. He is the author of several articles in this area, including the book Fuzzy Systems Design Principles published by IEEE in 1997. Before launching hakia, Dr. Berkan worked for the U.S. Government for a decade with emphasis on information handling, criticality safety and safeguards. He holds a Ph.D. in Nuclear Engineering from the University of Tennessee, and a B.S. in Physics from Hacettepe University, Turkey.

More

[ http://venturebeat.com/2007/12/12/hakia-raising-5m-for-semantic-search/ ]
[ http://mashable.com/2007/12/12/hakia-funded/ ]
[ http://www.hakia.com/ ]
[ http://blog.hakia.com/ ]
[ http://www.hakia.com/about.html ]
[ http://www.readwriteweb.com/archives/hakia_takes_on_google_semantic_search.php ]
[ http://www.readwriteweb.com/archives/hakia_meaning-based_search.php ]
[ http://siteanalytics.compete.com/hakia.com/?metric=uv ]
[ http://www.internetoutsider.com/2007/07/the-big-problem.html ]
[ http://www.quantcast.com/search/hakia.com ]
[ http://www.redherring.com/Home/19789 ]
[ http://web2innovations.com/hakia.com.php ]
[ http://www.pandia.com/sew/507-hakia.html ]
[ http://www.searchenginejournal.com/hakias-semantic-search-the-answer-to-poor-keyword-based-relevancy/5246/ ]
[ http://arstechnica.com/articles/culture/hakia-semantic-search-set-to-music.ars ]
[ http://www.news.com/8301-10784_3-9800141-7.html ]
[ http://searchforbettersearch.com/ ]
[ https://web2innovations.com/money/2007/12/01/is-google-trying-to-become-a-social-search-engine/ ]
[ http://www.web2summit.com/cs/web2006/view/e_spkr/3008 ]
[ http://www.techcrunch.com/2007/12/18/googles-norvig-is-down-on-natural-language-search/ ]

Hakia takes on major search engines backed by a small army of international investors

In our planned series of publications about the Semantic Web and its apps, Hakia is today our 3rd featured company.

Hakia.com, just like Freebase and Powerset, also relies heavily on semantic technologies to produce and deliver, hopefully, better and more meaningful results to its users.

Hakia is building the Web’s new “meaning-based” (semantic) search engine with the sole purpose of improving search relevancy and interactivity, pushing the current boundaries of Web search. The benefits to the end user are search efficiency, richness of information, and time savings. The basic promise is to bring search results by meaning match – similar to the human brain’s cognitive skills – rather than by the mere occurrence (or popularity) of search terms. Hakia’s new technology is a radical departure from the conventional indexing approach, because indexing has severe limitations to handle full-scale semantic search.

Hakia’s capabilities will appeal to all Web searchers – especially those engaged in research on knowledge intensive subjects, such as medicine, law, finance, science, and literature. The mission of hakia is the commitment to search for better search.

Here are the technological differences of hakia in comparison to conventional search engines.

QDEX Infrastructure

  • hakia’s designers broke from the decades-old indexing method and built a more advanced system called QDEX (which stands for Query Detection and Extraction) to enable semantic analysis of Web pages, and “meaning-based” search.
  • QDEX analyzes each Web page much more intensely, dissecting it into its knowledge bits, then storing them as gateways to all possible queries one can ask (see the sketch after this list).
  • The information density in the QDEX system is significantly higher than that of a typical index table, which is a basic requirement for undertaking full semantic analysis.
  • The QDEX data resides on a distributed network of fast servers using a mosaic-like data storage structure.
  • QDEX has superior scalability properties because data segments are independent of each other.
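
hakia has not published its actual algorithms, but as a rough illustration of what “storing knowledge bits as gateways to queries” could look like, here is a toy sketch – the pattern, names and question forms below are entirely our own assumptions:

    from collections import defaultdict

    # qdex maps a possible query to the "knowledge bits" that answer it,
    # inverting on queries rather than on keywords.
    qdex = defaultdict(list)

    def possible_queries(sentence):
        # Toy "query detection": recognize one sentence pattern and derive
        # the question forms that sentence could answer.
        subject, _, rest = sentence.partition(" is the capital of ")
        if rest:
            country = rest.rstrip(".").lower()
            return ["what is the capital of " + country,
                    "which country is " + subject.lower() + " the capital of"]
        return []

    def analyze_page(url, page_sentences):
        for sentence in page_sentences:
            for query in possible_queries(sentence):
                qdex[query].append((url, sentence))

    analyze_page("example.org/france", ["Paris is the capital of France."])
    print(qdex["what is the capital of france"])
    # -> [('example.org/france', 'Paris is the capital of France.')]

Under this reading, answering a query is a direct lookup, but far more data is written at index time – consistent with the higher information density claimed above.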

SemanticRank Algorithm

  • hakia’s SemanticRank algorithm comprises innovative solutions from the disciplines of Ontological Semantics, Fuzzy Logic, Computational Linguistics, and Mathematics.
  • Designed for the express purpose of higher relevancy.
  • Sets the stage for search based on meaning of content rather than the mere presence or popularity of keywords.
  • Deploys a layer of on-the-fly analysis with superb scalability properties.
  • Takes into account the credibility of sources among equally meaningful results.
  • Evolves its capacity to understand text from BETA operation onward.

In our tests we asked Hakia three English-language questions:

Why did the stock market crash? [ http://www.hakia.com/search.aspx?q=why+did+the+stock+market+crash%3F ]
Where do I get good bagels in Brooklyn? [ http://www.hakia.com/search.aspx?q=where+can+i+find+good+bagels+in+brooklyn ]
Who invented the Internet? [ http://www.hakia.com/search.aspx?q=who+invented+the+internet ]

It basically returned intelligent results for all of them. For example, Hakia understood that, when we asked “why,” we would be interested in results with the words “reason for” – and produced some relevant ones.

Hakia is one of the few promising alternative search engines closely watched by Charles Knight at his blog AltSearchEngines.com, with a focus on natural language processing methods to try to deliver ‘meaningful’ search results. Hakia attempts to analyze the concept of a search query, in particular by doing sentence analysis. Most other major search engines, including Google, analyze keywords. The company believes that the future of search engines will go beyond keyword analysis – search engines will talk back to you and in effect become your search assistant. One point worth noting here is that, currently, Hakia still has some human post-editing going on – so it isn’t 100% computer-powered at this point and is closer to a human-powered search engine, or a combination of the two.

They hope to provide better search results for complex queries than Google currently offers, but they have a long way to go to catch up, considering Google’s vast lead in the search market, sophisticated technology, and rich coffers. Hakia’s semantic search technology aims to understand the meaning of search queries to improve the relevancy of the search results.

Instead of relying on indexing the web or on the popularity of particular web pages, as many search engines do, hakia tries to match the meaning of the search terms to mimic the cognitive processes of the human brain.

“We’re mainly focusing on the relevancy problem in the whole search experience,” said Dr. Berkan in an interview Friday. “You enter a question and get better relevancy and better results.”

Dr. Berkan contends that search engines that use indexing and popularity algorithms are not as reliable with combinations of four or more words since there are not enough statistics available on which to base the most relevant results.

“What we are doing is an ultimate approach, doing meaning-based searches so we understand the query and the text, and make an association between them by semantic analysis,” he said.

Analyzing whole sentences instead of keywords would inevitably increase the company’s cost to index and process the world’s information. The case is pretty much the same with Powerset, which is also doing deep contextual analysis on every sentence on every web page, and it is a publicly known fact that it has higher costs for indexing and analyzing than Google. Considering that Google has more than 450,000 servers in several major data centers, and that hakia’s indexing and storage costs might be even higher, the approach they are taking might cost their investors a fortune to keep the company alive.

It would be interesting to find out whether hakia is also building its architecture upon the Hbase/Hadoop environment, just as Powerset does.

In the context of indexing and storing the world’s information, it is worth mentioning that there is yet another start-up search engine, called Cuill, that is claiming to have invented a technology for cheaper and faster indexation than Google’s. Cuill claims that its indexing costs will be 1/10th of Google’s, based on new search architectures and relevance methods.

Speaking of semantic textual analysis and the presentation of meaningful results, NosyJoe.com is a great example of both; yet it seems it is not going to index and store the world’s information and then apply contextual analysis to it, but is rather focusing on what is quality and important to the people participating in its social search engine.

A few months ago Hakia launched a new social feature called “Meet Others.” It gives you the option, from a search results page, to jump to a page on the service where everyone who searches for the topic can communicate.

For some idealized types of searching, it could be great. For example, suppose you were searching for information on a medical condition. Meet Others could connect you with other people looking for info about the condition, making an ad-hoc support group. On the Meet Others page, you’re able to add comments, or connect directly with the people on the page via anonymous e-mail or by Skype or instant messaging.

On the other hand, to implement social recommendations and rely on social elements like Hakia’s Meet Others feature, one needs huge traffic to turn that interesting social feature into an effective information discovery tool. For example, Google, with its more than 500 million unique searchers per month, can easily beat such social attempts undertaken by the smaller players if it only decides to employ, in one way or another, its users to find, determine the relevancy of, share and recommend results others also search for. Such attempts by Google are already in place, as one can read over here: Is Google trying to become a social search engine.

Reach

According to Quantcast, Hakia is basically not a very popular site and is reaching fewer than 150,000 unique visitors per month. Compete is reporting much better numbers – slightly below 1 million uniques per month. Considering that the search engine is still in its beta stage, these numbers are more than great. Analyzing the traffic curve on both measuring sites further, it appears that the traffic hakia gets is sort of campaign-based – in other words, generated by advertising, promotion or PR activity – and is not permanent organic traffic due to heavy usage of the site.

The People

Founded in 2004, hakia is a privately held company with headquarters in downtown Manhattan. hakia operates globally with teams in the United States, Turkey, England, Germany, and Poland.

The founder of hakia is Dr. Berkan, a nuclear scientist with a specialization in artificial intelligence and fuzzy logic. He is the author of several articles in this area, including the book Fuzzy Systems Design Principles published by IEEE in 1997. Before launching hakia, Dr. Berkan worked for the U.S. Government for a decade with emphasis on information handling, criticality safety and safeguards. He holds a Ph.D. in Nuclear Engineering from the University of Tennessee, and a B.S. in Physics from Hacettepe University, Turkey. He has been developing the company’s semantic search technology with help from Professor Victor Raskin of Purdue University, who specializes in computational linguistics and ontological semantics, and is the company’s chief scientific advisor.

Dr. Berkan resisted VC firms because he worried they would demand too much control and push development too fast to get the technology to the product phase so they could earn back their investment.

When he met Dr. Raskin, he discovered they had similar ideas about search and semantic analysis, and by 2004 they had laid out their plans.

They currently have 20 programmers working on building the system in New York, and another 20 to 30 contractors working remotely from different locations around the world, including Turkey, Armenia, Russia, Germany, and Poland.
The programmers are developing the search engine so it can better handle complex queries and maybe surpass some of its larger competitors.

Management

  • Dr. Riza C. Berkan, Chief Executive Officer
  • Melek Pulatkonak, Chief Operating Officer
  • Tim McGuinness, Vice President, Search
  • Stacy Schinder, Director of Business Intelligence
  • Dr. Christian F. Hempelmann, Chief Scientific Officer
  • John Grzymala, Chief Financial Officer

Board of Directors

  • Dr. Pentti Kouri, Chairman
  • Dr. Riza C. Berkan, CEO
  • John Grzymala
  • Anuj Mathur, Alexandra Global Fund
  • Bill Bradley, former U.S. Senator
  • Murat Vargi, KVK
  • Ryszard Krauze, Prokom Investments

Advisory Board

  • Prof. Victor Raskin (Purdue University)
  • Prof. Yorick Wilks, (Sheffield University, UK)
  • Mark Hughes

Investors

Hakia is known to have raised $11 million in its first round of funding from a panoply of investors scattered across the globe who were attracted by the company’s semantic search technology.

The New York-based company said it decided to snub the usual players in the venture capital community lining Silicon Valley’s Sand Hill Road and opted for its international connections instead, including financial firms, angel investors, and a telecommunications company.

Poland

Among them were Poland’s Prokom Investments, an investment group active in the oil, real estate, IT, financial, and biotech sectors.

Turkey

Another investor, Turkey’s KVK, distributes mobile telecom services and products in Turkey. Also from Turkey, angel investor Murat Vargi pitched in some funding. He is one of the founding shareholders in Turkcell, a mobile operator and the only Turkish company listed on the New York Stock Exchange.

Malaysia

In Malaysia, hakia secured funding from angel investor Lu Pat Ng, who represented his family, which has substantial investments in companies worldwide.

Finland

From Finland, hakia turned to Dr. Pentti Kouri, an economist and VC who was a member of the Nokia board in the 1980s. He has taught at Stanford, Yale, New York University, and Helsinki University, and worked as an economist at the International Monetary Fund. He is currently based in New York.

United States

In the United States, hakia received funding from Alexandra Investment Management, an investment advisory firm that manages a global hedge fund. Also from the U.S., former Senator and New York Knicks basketball player Bill Bradley has joined the company’s board, along with Dr. Kouri, Mr. Vargi, Anuj Mathur of Alexandra Investment Management, and hakia CEO Riza Berkan.

Hakia was one of the first alternative search engines to make the home page of Web2Innovations in the past year… http://web2innovations.com/hakia.com.php

Hakia.com is the 3rd Semantic App featured by Web2Innovations in its series of planned publications, in which we will try to discover, highlight and feature the next generation of web-based semantic applications, engines, platforms, mash-ups, machines, products, services, mixtures, parsers, approaches and far beyond.

The purpose of these publications is to discover and showcase today’s Semantic Web Apps and projects. We’re not going to rank them, because there is no way to rank these apps at this time – many are still in alpha and private beta.

Via

[ http://www.hakia.com/ ]
[ http://blog.hakia.com/ ]
[ http://www.hakia.com/about.html ]
[ http://www.readwriteweb.com/archives/hakia_takes_on_google_semantic_search.php ]
[ http://www.readwriteweb.com/archives/hakia_meaning-based_search.php ]
[ http://siteanalytics.compete.com/hakia.com/?metric=uv ]
[ http://www.internetoutsider.com/2007/07/the-big-problem.html ]
[ http://www.quantcast.com/search/hakia.com ]
[ http://www.redherring.com/Home/19789 ]
[ http://web2innovations.com/hakia.com.php ]
[ http://www.pandia.com/sew/507-hakia.html ]
[ http://www.searchenginejournal.com/hakias-semantic-search-the-answer-to-poor-keyword-based-relevancy/5246/ ]
[ http://arstechnica.com/articles/culture/hakia-semantic-search-set-to-music.ars ]
[ http://www.news.com/8301-10784_3-9800141-7.html ]
[ http://searchforbettersearch.com/ ]
[ https://web2innovations.com/money/2007/12/01/is-google-trying-to-become-a-social-search-engine/ ]
[ http://www.web2summit.com/cs/web2006/view/e_spkr/3008 ]
 

Powerset – the natural language processing search engine empowered by Hbase in Hadoop

In our planned series of publications about the Semantic Web and its apps, Powerset is today our second featured company, after Freebase.

Powerset is a Silicon Valley based company building a transformative consumer search engine based on natural language processing. Their unique innovations in search are rooted in breakthrough technologies that take advantage of the structure and nuances of natural language. Using these advanced techniques, Powerset is building a large-scale search engine that breaks the confines of keyword search. By making search more natural and intuitive, Powerset is fundamentally changing how we search the web, and delivering higher quality results.

Powerset’s search engine is currently under development and is closed to the general public. You can always keep an eye on them in order to learn more about their technology and approach.

Despite all the press attention Powerset is gaining, there are very few details publicly available about the search engine. In fact, Powerset has lately been one of the most buzzed-about companies in Silicon Valley, for good or bad.

Power set is a term from mathematics: given a set S, the power set (or powerset) of S, written P(S), ℘(S), or 2^S, is the set of all subsets of S. In axiomatic set theory (as developed, e.g., in the ZFC axioms), the existence of the power set of any set is postulated by the axiom of power set. Any subset F of P(S) is called a family of sets over S.
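
As a quick worked example of the definition (our illustration, using Python’s standard itertools): for S = {a, b}, P(S) has 2^2 = 4 elements.

    from itertools import chain, combinations

    def power_set(s):
        # All subsets of s, from the empty set up to s itself.
        items = list(s)
        return [set(c) for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1))]

    print(power_set({"a", "b"}))
    # -> [set(), {'a'}, {'b'}, {'a', 'b'}]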

From the latest information publicly available about Powerset we learn that, just like some other start-up search engines, they are also using Hbase in a Hadoop environment to process vast amounts of data.

It also appears that Powerset relies on a number of proprietary technologies such as the XLE, licensed from PARC, ranking algorithms, and the ever-important onomasticon (a list of proper nouns naming persons or places).


For any other component, Powerset tries to use open source software whenever available. One of the unsung heroes that form the foundation for all of these components is the ability to process insane amounts of data. This is especially true for a Natural Language search engine. A typical keyword search engine will gather hundreds of terabytes of raw data to index the Web. Then, that raw data is analyzed to create a similar amount of secondary data, which is used to rank search results. Since Powerset’s technology creates a massive amount of secondary data through its deep language analysis, Powerset will be generating far more data than a typical search engine, eventually ranging up to petabytes of data.
Powerset has already benefited greatly from the use of Hadoop: their index build process is entirely based on a Hadoop cluster running the Hadoop Distributed File System (HDFS) and makes use of Hadoop’s map/reduce features.

In fact Google also uses a number of well-known components to fulfill their enormous data processing needs: a distributed file system (GFS) ( http://labs.google.com/papers/gfs.html ), Map/Reduce ( http://labs.google.com/papers/mapreduce.html ), and BigTable ( http://labs.google.com/papers/bigtable.html ).

Hbase is actually the open-source equivalent of Google’s Bigtable and, as far as we understand the matter, is a great technological achievement of the guys behind Powerset. Both Jim Kellerman and Michael Stack are from Powerset and are the initial contributors to Hbase.

Hbase could be the panacea for Powerset in scaling its index up to Google’s level, yet copying Google’s approach is perhaps not the right direction for a small technological company like Powerset. We wonder whether Cuill, yet another start-up search engine that is claiming to have invented a technology for cheaper and faster indexation than Google’s, has built its architecture upon the Hbase/Hadoop environment. Cuill claims that its indexing costs will be 1/10th of Google’s, based on new search architectures and relevance methods. If that is true, what would Powerset’s costs then be, considering that Powerset probably has higher indexing costs even than Google, because it does a deep contextual analysis on every sentence on every web page? Considering that Google has more than 450,000 servers in several major data centers, and that Powerset’s indexing and storage costs might be even higher, the approach Powerset is taking might be a costly business for its investors.

Unless, that is, Hbase and Hadoop are the secret answer Powerset relies on to significantly reduce those costs.

Hadoop is an interesting software platform that lets one easily write and run applications that process vast amounts of data.

Here’s what makes Hadoop especially useful:

  • Scalable: Hadoop can reliably store and process petabytes.
  • Economical: It distributes the data and processing across clusters of commonly available computers. These clusters can number into the thousands of nodes.
  • Efficient: By distributing the data, Hadoop can process it in parallel on the nodes where the data is located. This makes it extremely rapid.
  • Reliable: Hadoop automatically maintains multiple copies of data and automatically redeploys computing tasks based on failures.

Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS). MapReduce divides applications into many small blocks of work. HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster. MapReduce can then process the data where it is located.

Hadoop has been demonstrated on clusters with 2,000 nodes. The current design target is 10,000-node clusters.

Hadoop is a Lucene sub-project that contains the distributed computing platform that was formerly a part of Nutch.
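
To make the map/reduce flow concrete, here is a minimal in-memory sketch of the classic word-count job; this is a toy simulation of the programming model, not the actual Hadoop API:

    from collections import defaultdict

    def map_phase(records, map_fn):
        # Apply the map function to every input record, collecting
        # the intermediate (key, value) pairs it emits.
        pairs = []
        for record in records:
            pairs.extend(map_fn(record))
        return pairs

    def reduce_phase(pairs, reduce_fn):
        # Group intermediate values by key, then reduce each group.
        groups = defaultdict(list)
        for key, value in pairs:
            groups[key].append(value)
        return {key: reduce_fn(key, values) for key, values in groups.items()}

    # The classic word count: map emits (word, 1), reduce sums the counts.
    docs = ["hadoop stores data", "hadoop processes data"]
    counts = reduce_phase(
        map_phase(docs, lambda doc: [(w, 1) for w in doc.split()]),
        lambda key, values: sum(values))
    print(counts)  # -> {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}

In real Hadoop, the map and reduce calls run in parallel across the cluster and the intermediate pairs are shuffled between nodes; the sequential version above only shows the data flow.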

Hbase’s background

Google’s  Bigtable, a distributed storage system for structured data, is a very effective mechanism for storing very large amounts of data in a distributed environment.  Just as Bigtable leverages the distributed data storage provided by the Google File System, Hbase will provide Bigtable-like capabilities on top of Hadoop. Data is organized into tables, rows and columns, but a query language like SQL is not supported. Instead, an Iterator-like interface is available for scanning through a row range (and of course there is an ability to retrieve a column value for a specific key). Any particular column may have multiple values for the same row key. A secondary key can be provided to select a particular value or an Iterator can be set up to scan through the key-value pairs for that column given a specific row key.
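
As a rough sketch of that data model – a toy, in-memory illustration of the layout described above, not the actual Hbase API:

    class ToyBigtable:
        # Toy model of the Bigtable/Hbase layout:
        # row -> column -> {timestamp: value}. Illustrative only.
        def __init__(self):
            self.rows = {}

        def put(self, row, column, value, timestamp):
            self.rows.setdefault(row, {}).setdefault(column, {})[timestamp] = value

        def get(self, row, column, timestamp=None):
            # The newest version wins unless a specific timestamp
            # (the "secondary key") is requested.
            versions = self.rows.get(row, {}).get(column, {})
            if not versions:
                return None
            ts = timestamp if timestamp is not None else max(versions)
            return versions.get(ts)

        def scan(self, start_row, end_row):
            # Iterator-like interface over a row range; no SQL involved.
            for row in sorted(self.rows):
                if start_row <= row < end_row:
                    yield row, self.rows[row]

    table = ToyBigtable()
    table.put("com.example/index", "anchor:home", "Example", timestamp=1)
    table.put("com.example/index", "anchor:home", "Example v2", timestamp=2)
    print(table.get("com.example/index", "anchor:home"))  # -> Example v2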

Reach

According to Quantcast, Powerset is basically not a popular site and is reaching fewer than 20,000 unique visitors per month, around 10,000 of them Americans. Compete is reporting the same – slightly more than 20,000 uniques per month. Considering that the search engine is still in its alpha stage, these numbers are not that bad.

The People

Powerset has assembled a star team of talented engineers, researchers, product innovators and entrepreneurs to realize an ambitious vision for the future of search. The team comprises industry leaders from a diverse set of companies including Altavista, Apple, Ask.com, BBN, Digital, IDEO, IBM, Microsoft, NASA, PARC, Promptu, SRI, Tellme, Whizbang! Labs, and Yahoo!.

The founders of Powerset are Barney Pell and Lorenzo Thione, and the company is actually headquartered in San Francisco. Recently Barney Pell stepped down from the CEO spot and is now the company’s CTO.

Barney Pell, Ph.D. (CTO) For over 15 years Barney Pell (Ph.D. Computer science, Cambridge University 1993) has pursued groundbreaking technical and commercial innovation in A.I. and Natural Language understanding at research institutions including NASA, SRI, Stanford University and Cambridge University. In startup companies, Dr. Pell was Chief Strategist and VP of Business Development at StockMaster.com (acquired by Red Herring in March, 2000) and later had the same role at Whizbang! Labs. Just prior to Powerset, Pell was an Entrepreneur in Residence at Mayfield, one of the top VC firms in Silicon Valley.

Lorenzo Thione (Product Architect) Mr. Thione brings to Powerset years of research experience in computational linguistics and search from Research Scientist positions at the CommerceNet consortium and the Fuji-Xerox Palo Alto Laboratory. His main research focus has been discourse parsing and document analysis, automatic summarization, question answering and natural language search, and information retrieval. He has co-authored publications in the field of computational linguistics and is a named inventor on 13 worldwide patent applications spanning the fields of computational linguistics, mobile user interfaces, search and information retrieval, speech technology, security and distributed computing. A native of Milan, Italy, Mr. Thione holds a Masters in Software Engineering from the University of Texas at Austin.

Board of Directors

Aside from Barney Pell, who is also serving on the company’s board of directors, the other board members are:

Charles Moldow (BOD) is a general partner at Foundation Capital. He joined Foundation on the heels of successfully building two companies from early start-up through greater than $100 million in sales. Most notably, Charles led Tellme Networks in raising one of the largest private financing rounds in the country post Internet bubble, adding $125 million in cash to the company balance sheet during tough market conditions in August, 2000. Prior to Tellme, Charles was a member of the founding team of Internet access provider @Home Network. In 1998, Charles assisted in the $7 billion acquisition of Excite Network. After the merger, Charles became General Manager of Matchlogic, the $80 million division focused on interactive advertising.

Peter Thiel (BOD) is a partner at Founders Fund VC Firm in San Francisco. In 1998, Peter co-founded PayPal and served as its Chairman and CEO until the company’s sale to eBay in October 2002 for $1.5 billion. Peter’s experience in finance includes managing a successful hedge fund, trading derivatives at CS Financial Products, and practicing securities law at Sullivan & Cromwell. Peter received his BA in Philosophy and his JD from Stanford.

Investors

In June 2007 Powerset raised $12.5M in a Series A round of funding from Foundation Capital and The Founders Fund. Early investors include Eric Tilenius and Peter Thiel, who is also an early investor in Facebook.com. Other early investors are as follows:

CommerceNet is an entrepreneurial research institute focused on making the world a better place by fulfilling the promise of the Internet. CommerceNet invests in exceptional people with bold ideas, freeing them to pursue visions outside the comfort zone of research labs and venture funds and share in their success.

Dr. Tenenbaum is a world-renowned Internet commerce pioneer and visionary. He was founder and CEO of Enterprise Integration Technologies, the first company to conduct a commercial Internet transaction (1992), secure Web transaction (1993) and Internet auction (1993). In 1994, he founded CommerceNet to accelerate business use of the Internet. In 1997, he co-founded Veo Systems, the company that pioneered the use of XML for automating business-to-business transactions. Dr. Tenenbaum joined Commerce One in January 1999, when it acquired Veo Systems. As Chief Scientist, he was instrumental in shaping the company’s business and technology strategies for the Global Trading Web. Earlier in his career, Dr. Tenenbaum was a prominent AI researcher, and led AI research groups at SRI International and Schlumberger Ltd. Dr. Tenenbaum is a Fellow and former board member of the American Association for Artificial Intelligence, and a former Consulting Professor of Computer Science at Stanford. He currently serves as an officer and director of Webify Solutions and Medstory Inc., and is a Consulting Professor of Information Technology at Carnegie Mellon’s new West Coast campus. Dr. Tenenbaum holds B.S. and M.S. degrees in Electrical Engineering from MIT, and a Ph.D. from Stanford. 

Allan Schiffman was CTO and founder of Terisa Systems, a pioneer in bringing communications security technology to the Web software industry. Earlier, Mr. Schiffman was Chief Technology Officer at Enterprise Integration Technologies, a pioneer in the development of key security protocols for electronic commerce over the Internet. In these roles, Mr. Schiffman has raised industry awareness of the role of security and public key cryptography in ecommerce by giving more than thirty public lectures and tutorials. Mr. Schiffman was also a member of the team that designed the Secure Electronic Transactions (SET) payment card protocol commissioned by MasterCard and Visa. Mr. Schiffman co-designed the first security protocol for the Web, the Secure HyperText Transfer Protocol (S-HTTP). Mr. Schiffman led the development of the first secure Web browser, Secure Mosaic, which was fielded to CommerceNet members for ecommerce trials in 1994. Earlier in his career, Mr. Schiffman led the development of a family of high-performance Smalltalk implementations that gained both academic recognition and commercial success. These systems included several innovations widely adopted by other object-oriented language implementers, such as the “just-in-time compilation” technique universally used by current Java virtual machines. Mr. Schiffman holds an M.S. in Computer Science from Stanford University.

Rob Rodin is the Chairman and CEO of RDN Group; strategic advisors focused on corporate transitions, customer interface, sales and marketing, distribution and supply chain management. Additionally, he serves as Vice Chairman, Executive Director and Chairman of the Investment Committee of CommerceNet which researches and funds open platform, interoperable business services to advance commerce. Prior to these positions, Mr. Rodin served as CEO and President of Marshall Industries, where he engineered the reinvention of the company, turning a conventionally successful $500 million distributor into a web enabled $2 billion global competitor. “Free, Perfect and Now: Connecting to the Three Insatiable Customer Demands”, Mr. Rodin’s bestselling book, chronicles the radical transformation of Marshall Industries. 

The Founders Fund – The Founders Fund, L.P. is a San Francisco-based venture capital fund that focuses primarily on early-stage, high-growth investment opportunities in the technology sector. The Fund’s management team is composed of investors and entrepreneurs with relevant expertise in venture capital, finance, and Internet technology. Members of the management team previously led PayPal, Inc. through several rounds of private financing, a private merger, an initial public offering, a secondary offering, and its eventual sale to eBay, Inc. The Founders Fund possesses the four key attributes that well-position it for success: access to elite research universities, contact to entrepreneurs, operational and financial expertise, and the ability to pick winners. Currently, the Founders Fund is invested in over 20 companies, including Facebook, Ironport, Koders, Engage, and the newly-acquired CipherTrust. 

Amidzad – Amidzad is a seed and early-stage venture capital firm focused on investing in emerging growth companies on the West Coast, with over 50 years of combined entrepreneurial experience in building profitable, global enterprises from the ground up and over 25 years of combined investing experience in successful information technology and life science companies. Over the years, Amidzad has assembled a world-class network of serial entrepreneurs, strategic investors, and industry leaders who actively assist portfolio companies as Entrepreneur Partners and Advisors. Amidzad has invested in companies like Danger, BIX, Songbird, Melodis, Freewebs, Agitar, Affinity Circles, Litescape and Picaboo.

Eric Tilenius brings a two-decade track record that combines venture capital, startup, and industry-leading technology company experience. Eric has made over a dozen investments in early-stage technology, internet, and consumer start-ups around the globe through his investment firm, Tilenius Ventures. Prior to forming Tilenius Ventures, Eric was CEO of Answers Corporation (NASDAQ: ANSW), which runs Answers.com, one of the leading information sites on the internet. He previously was an entrepreneur-in-residence at venture firm Mayfield. Prior to Mayfield, Eric was co-founder, CEO, and Chairman of Netcentives Inc., a leading loyalty, direct, and promotional internet marketing firm. Eric holds an MBA from the Stanford University Graduate School of Business, where he graduated as an Arjay Miller scholar, and an undergraduate degree in economics, summa cum laude, from Princeton University.

Esther Dyson does business as EDventure, the reclaimed name of the company she owned for 20-odd years before selling it to CNET Networks in 2004. Her primary activity is investing in start-ups and guiding many of them as a board member. Her board seats include Boxbe, CVO Group (Hungary), Eventful.com, Evernote, IBS Group (Russia, advisory board), Meetup, Midentity (UK), NewspaperDirect, Voxiva, Yandex (Russia)… and WPP Group (not a start-up). Some of her other past IT investments include Flickr and Del.icio.us (sold to Yahoo!), BrightMail (sold to Symantec), Medstory (sold to Microsoft), Orbitz (sold to Cendant and later re-IPOed). Her current holdings include ActiveWeave, BlogAds, ChoiceStream, Democracy Machine, Dotomi, Linkstorm, Ovusoft, Plazes, Powerset, Resilient, Tacit, Technorati, Visible Path, Vizu.com and Zedo. On the non-profit side, Dyson sits on the boards of the Eurasia Foundation, the National Endowment for Democracy, the Santa Fe Institute and the Sunlight Foundation. She also blogs occasionally for the Huffington Post, as Release 0.9.

Adrian Weller – Adrian graduated in 1991 with first class honours in mathematics from Trinity College, Cambridge, where he met Barney. He moved to NY, ran Goldman Sachs’ US Treasury options trading desk and then joined the fixed income arbitrage trading group at Salomon Brothers. He went on to run US and European interest rate trading at Citadel Investment Group in Chicago and London. Recently, Adrian has been traveling, studying and managing private investments. He resides in Dublin with his wife, Laura and baby daughter, Rachel.

Azeem Azhar – Azeem is currently a technology executive focussed on corporate innovation at a large multinational. He began his career as a technology writer, first at The Guardian and then The Economist. While at The Economist, he launched Economist.com. Since then, he has been involved with several internet and technology businesses, including launching BBC Online and founding esouk.com, an incubator. He was Chief Marketing Officer for Albert-Inc, a Swiss AI/natural language processing search company, and UK MD of 20six, a blogging service. He has advised several internet start-ups including Mondus, Uvine and Planet Out Partners, where he sat on the board. He has a degree in Philosophy, Politics and Economics from Oxford University. He currently sits on the board of Inuk Networks, which operates an IPTV broadcast platform. Azeem lives in London with his wife and son.

Todd Parker – Since 2002, Mr. Parker has been a Managing Director at Hidden River, LLC, a firm specializing in Mergers and Acquisitions consulting services to the wireless and communications industry. Previously, from 2000 to 2002, Mr. Parker was the founder and CEO of HR One, a human resources solutions provider and software company. Mr. Parker has also held senior executive and general manager positions with AirTouch Corporation, where he managed over 15 corporate transactions and joint venture formations with a total value of over $6 billion. Prior to AirTouch, Mr. Parker worked for Arthur D. Little as a consultant. Mr. Parker earned a BS from Babson College in Entrepreneurial Studies and Communications.

Powerset.com is the 2nd Semantic App featured by Web2Innovations in its series of planned publications, in which we will try to discover, highlight and feature the next generation of web-based semantic applications, engines, platforms, mash-ups, machines, products, services, mixtures, parsers, approaches and far beyond.

The purpose of these publications is to discover and showcase today’s Semantic Web Apps and projects. We’re not going to rank them, because there is no way to rank these apps at this time – many are still in alpha and private beta.

Via

[ http://www.powerset.com ]
[ http://www.powerset.com/about ]
[ http://en.wikipedia.org/wiki/Power_set ]
[ http://en.wikipedia.org/wiki/Powerset ]
[ http://blog.powerset.com/ ]
[ http://lucene.apache.org/hadoop/index.html ]
[ http://wiki.apache.org/lucene-hadoop/Hbase ]
[ http://blog.powerset.com/2007/10/16/powerset-empowered-by-hadoop ]
[ http://www.techcrunch.com/2007/09/04/cuill-super-stealth-search-engine-google-has-definitely-noticed/ ]
[ http://www.barneypell.com/ ]
[ http://valleywag.com/tech/rumormonger/hanky+panky-ousts-pell-as-powerset-ceo-318396.php ]
[ http://www.crunchbase.com/company/powerset ]

Is Google trying to become a Social Search Engine?

Based on what we are seeing, the answer is close to yes. Google is now experimenting with new social features aimed at improving users’ search experience.

This experiment lets you influence your search experience by adding, moving, and removing search results. When you search for the same keywords again, you’ll continue to see those changes. If you later want to revert your changes, you can undo any modifications you’ve made. Note that Google claims this is an experimental feature and may be available for only a few weeks.

There seem to be features like “Like it”, “Don’t like it?” and “Know of a better web page”. Of course, to take full advantage of these extras, as well as to have your recommendations associated with your searches later, upon your return, you have to be signed in.
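
As we read the description, the experiment amounts to storing per-user, per-query edits and replaying them on later searches. Here is a hedged sketch of that mechanic; all names are our own and this is not Google’s implementation:

    edits_store = {}  # (user, query) -> {"liked": set, "removed": set}

    def record_edit(user, query, url, action):
        edits = edits_store.setdefault((user, query),
                                       {"liked": set(), "removed": set()})
        if action == "like":
            edits["liked"].add(url)
            edits["removed"].discard(url)
        elif action == "remove":
            edits["removed"].add(url)
            edits["liked"].discard(url)

    def personalized_results(user, query, base_results):
        edits = edits_store.get((user, query))
        if edits is None:
            return base_results  # signed-out users see the stock ranking
        kept = [u for u in base_results if u not in edits["removed"]]
        liked = [u for u in kept if u in edits["liked"]]
        return liked + [u for u in kept if u not in edits["liked"]]

    record_edit("alice", "jaguar", "jaguar-cars.example", "remove")
    record_edit("alice", "jaguar", "jaguar-cats.example", "like")
    results = ["jaguar-cars.example", "wiki.example/jaguar", "jaguar-cats.example"]
    print(personalized_results("alice", "jaguar", results))
    # -> ['jaguar-cats.example', 'wiki.example/jaguar']

Note that the votes only reorder that one user’s results for that one query, which is why it is hard to call the experiment “social”, as we argue below.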

There is nothing new here – many of the smaller social search engines are already deploying and using some of the features Google is just now trying to test – but having more than 500 million unique visitors per month, the vast majority of whom are heavily using Google’s search engine, is a huge advantage if one wants to implement social elements in finding information on the web easily. Even Marissa Mayer, Google’s leading executive in search, said in August that Google would be well positioned to compete in social search. Actually, with this experiment in particular, it appears your vote only applies to which Google search results you will see, so it is hard to call it “social” at this time. It may, though, prove valuable as a stand-alone service. Also, Daniel Russell of Google made it pretty clear some time ago that they use user behavior to affect search results. Effectively, that is using implicit voting, rather than explicit voting.

We think, however, that the only reason Google is trying to deal with these social features, relying on humans to determine the relevancy, is its inability to effectively fight the spam its SERPs are flooded with.

Manipulating algorithm-based results, in one way or another, is in our understanding not much harder than what you would eventually be able to do to manipulate or influence results that rely and depend on social recommendations. Look at Digg, for example.

We think employing humans to determine which results are best is basically an effective pathway to corruption, which is in a way worse than having an algorithm to blame for the spam and low quality of the results. Again, take a look at Digg, dmoz.org and, most of all, Wikipedia. Wikipedia, once a good idea, became a battlefield for corporate, brand, political and social wars. That being said, we think Google’s problem with spam results comes down to the way they reach the information – more concretely, the methods they use to crawl and index the vast Web. Conversely, having people, instead of robots, gather the quality, important information (from everyone’s point of view) from around the web is, in our understanding, a much better and more effective approach than having all the spam results loaded onto the servers and then letting people sort them out.

That’s not the first time Google has tried new features with its search results. We remember searchmash.com. Searchmash.com is yet another of Google’s toys in the search arena, one that was quietly started a year ago because Google did not want the public to know about the project and influence its beta testers (read: the common users) with the brand name Google. The project, however, quickly became popular once many people discovered who the actual owner of the beta project is.

Google is no doubt getting all the press attention it needs, no matter what it does – and sometimes even more than it actually needs. On the other hand, things seem to be slowly changing today, and influential media like the New York Times, Newsweek, CNN and many others are on a quest for the next search engine, the next Google. This was simply impossible during 2001, 2002 and up to 2004, a period characterized by solid media comfort for Google’s search engine business.

So, is Google the first one to experiment with social search approaches, features, methods and extras? No, definitely not as you are going to see for yourself from the companies and projects listed below.

As for crediting a Digg-like system with the idea of sorting content based on community voting, they definitely weren’t the first. The earliest implementation of this we are aware of is Kuro5hin.org (http://en.wikipedia.org/wiki/Kuro5hin), which, we think, was founded back in 1999.

Eurekster

One of the first and oldest companies to be coined a social search engine on the Web is Eurekster.
Eurekster launched its community-powered social search platform, the “swicki”, as far as we know in 2004, and explicit voting functionality in 2006. To date, over 100,000 swickis have been built, each serving a community of users passionate about a specific topic. Eurekster processes over 25,000,000 searches a month. The key to Eurekster’s success in improving relevancy has been leveraging explicit (and implicit) user behavior at the group or community level, not the individual or general level. On the other hand, Eurekster never made it to mainstream users, and somehow the company slowly faded away and lost momentum.

Wikia Social Search

Wikia was founded by Jimmy Wales (Wikipedia’s founder) and Angela Beesley in 2004. The company is incorporated in Delaware. Gil Penchina became Wikia’s CEO in June 2006, at the same time the company moved its headquarters from St. Petersburg, Florida, to Menlo Park and later to San Mateo in California. Wikia has offices in San Mateo and New York in the US, and in Poznań in Poland. Remote staff is also located in Chile, England, Germany, Japan, Taiwan, and also in other locations in Poland and the US. Wikia has received two rounds of investment; in March 2006 from Bessemer Venture Partners and in December 2006 from Amazon.com.

According to Wikia Search, the future of Internet search must be based on:

  1. Transparency – Openness in how the systems and algorithms operate, both in the form of open source licenses and open content + APIs.
  2. Community – Everyone is able to contribute in some way (as individuals or entire organizations), strong social and community focus.
  3. Quality – Significantly improve the relevancy and accuracy of search results and the searching experience.
  4. Privacy – Must be protected, do not store or transmit any identifying data.

Other active areas of focus include:

  1. Social Lab – sources for URL social reputation, experiments in wiki-style social ranking.
  2. Distributed Lab – projects focused on distributed computing, crawling, and indexing. Grub!
  3. Semantic Lab – Natural Language Processing, Text Categorization.
  4. Standards Lab – formats and protocols to build interoperable search technologies.

Based on who Jimmy Wales is, the success he achieved with Wikipedia, and therefore the resources he might have access to, Wikia Search stands a good chance of surviving against any serious competition from Google.

NosyJoe.com

NosyJoe is yet another great example of a social search engine that employs intelligent tagging technologies and runs on a semantic platform.

NosyJoe is a social search engine that relies on you to sniff for and submit the web’s interesting content, and offers basically meaningful search results in the form of readable complete sentences and smart tags. NosyJoe is built upon the fundamental belief that people are better than robots at finding the interesting, important and quality content around the Web. Rather than crawling the entire Web to build a massive index of information – which, aside from being an enormous technological task that requires a huge amount of resources and time, would also load lots of unnecessary information people don’t want – NosyJoe focuses just on those parts of the Web people think are important and find interesting enough to submit and share with others.

NosyJoe is a hybrid: a social search engine that relies on you to sniff for and submit the web’s interesting content, an intelligent content tagging engine on the back end, and a basic semantic platform on its web-visible part. NosyJoe then applies a semantics-based textual analysis and intelligently extracts meaningful structures like sentences, phrases, words and names from the content in order to make it just that one idea more meaningfully searchable. This helps it present the search results in basically meaningful formats like readable complete sentences and smart phrasal, word and name tags.

The information is then clustered and published across NosyJoe’s platform into contextual channels and time and source categories, and semantic phrasal, name and word tags are also applied to meaningfully connect items together, which makes even the smallest content component web-visible, indexable and findable. At the end, a set of algorithms and user patterns is applied to further rank, organize and share the information.
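
We can only guess at NosyJoe’s internals, but a toy sketch of such a sentence-and-tag pipeline might look like this (entirely our own illustration, with naive extraction rules):

    import re
    from collections import defaultdict

    def split_sentences(text):
        # Naive sentence splitter, good enough for the sketch.
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def extract_tags(sentence):
        # Capitalized tokens become name tags; longer lowercase words
        # become word tags. Real systems use much deeper analysis.
        names = re.findall(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", sentence)
        words = re.findall(r"\b[a-z]{4,}\b", sentence.lower())
        return set(names) | set(words)

    tag_index = defaultdict(list)  # tag -> complete, readable sentences

    def ingest(text):
        for sentence in split_sentences(text):
            for tag in extract_tags(sentence):
                tag_index[tag].append(sentence)

    ingest("NosyJoe stores complete sentences. Each sentence becomes searchable.")
    print(tag_index["sentences"])  # -> ['NosyJoe stores complete sentences.']

The point of indexing whole sentences under each tag is that a search result can be returned as a readable sentence rather than a keyword excerpt, which matches what we saw in our tests below.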

From our quick tests on the site, the search results returned were presented in the form of meaningful sentences and semantic phrasal tags (as an option), which turns their search results into something we have never seen on the web so far: neat content components – readable and easily understandable sentences – unlike the excerpts of content around the keyword that we are all used to. When compared to other search engines’ results, NosyJoe.com’s SERPs appear truly meaningful.

As of today, just 6 or 7 months since they went online, NosyJoe already has more than 500,000 semantic tags created that connect tens of thousands of meaningful sentences across its platform.

We have no information as to who stands behind NosyJoe, but the project seems very serious and promising in many aspects, from how they gather the information to how they present the results to the way they offset low-quality results. Of all the newcomer social search engines, NosyJoe stands the best chance of making it. As far as we know, NosyJoe is also based in Silicon Valley.

Sproose

Sproose says it is developing search technology that lets users obtain personalized results, which can be shared among a social network, using the Nutch open-source search engine and building applications on top. Its search appears to use third-party search feeds and rank the results based on users’ votes.

Sproose is said to have raised nearly $1 million in seed funding. It is based in Danville, a town on the east side of the SF Bay Area. Sproose said Roger Smith, founder, former president and chief executive at Silicon Valley Bank, was one of the angel investors and is joining Sproose’s board.

A variety of other start-up search engines are listed below:

  • Hakia – Relies on natural language processing. These guys are also experimenting with social elements via a feature called “meet others who asked the same query”.
  • Quintura – A visual search engine now based in Virginia, US. The company was founded by Russians and was initially headquartered in Moscow.
  • Mahalo – A search engine that looks more like a directory, with quality content handpicked by editors. Jason Calacanis is the founder of the company.
  • ChaCha – Real humans try to help you in your quest for information, via chat. The company is based in Indiana and has been criticized a lot by Silicon Valley’s IT community. Despite the criticism, it has recently raised $10M in a Series A round of funding.
  • Powerset – Still in closed beta and also relying on understanding natural language. See our Powerset review.
  • Clusty – Founded in 2000 by three Carnegie Mellon University scientists.
  • Lexxe – A Sydney-based engine featuring natural language processing technologies.
  • Accoona – The company has recently filed for an IPO in the US, planning to raise $80M from the public.
  • Squidoo – Started in October 2005 by Seth Godin, it looks more like a wiki site, à la Wikia or Wikipedia, where anyone can create articles on different topics.
  • Spock – A people search engine focusing on information about people.

One thing is for sure today: Google is now bringing solid credentials to the social search approach and is in some ways legitimizing it, which, by the way, is helping the many smaller so-called social search engines.

Perhaps it is about time for consolidation in the social search sector. Some of the smaller but more promising social search engines could now merge in order to compete with Google and prevent it from dominating the social search sector too, just as it did with algorithmic search. Is Google itself interested? Has anyone heard of recent interest in, or already closed, acquisition deals for start-up social search engines?

On the other hand, more and more IT experts, evangelists and web professionals agree that taking Google down is a challenge that will most likely be met by a concept that is anything but a search engine in the traditional sense. Such concepts include, but are not limited to, Wikipedia, Del.icio.us and LinkedWords. In other words, finding information on the web doesn’t necessarily mean searching for it.

Via:
[ http://www.google.com/experimental/a840e102.html ]
[ http://www.blueverse.com/2007/12/01/google-the-social-…]
[ http://www.adesblog.com/2007/11/30/google-experimenting-social… ]
[ http://www.techcrunch.com/2007/11/28/straight-out-of-left-field-google-experimenting-with-digg-style-voting-on-search-results ]
[ http://www.blogforward.com/money/2007/11/29/google… ]
[ http://nextnetnews.blogspot.com/2007/09/is-nosyjoecom-… ]
[ http://www.newsweek.com/id/62254/page/1 ]
[ http://altsearchengines.com/2007/10/05/the-top-10-stealth-… ]
[ http://www.nytimes.com/2007/06/24/business/yourmoney/…  ]
[ http://dondodge.typepad.com/the_next_big_thing/2007/05… ]
[ http://search.wikia.com/wiki/Search_Wikia ]
[ http://nosyjoe.com/about.com ]
[ http://www.siliconbeat.com/entries/2005/11/08/sproose_up_your… ]
[ http://nextnetnews.blogspot.com/2007/10/quest-for-3rd-generation… ]
[ http://www.sproose.com ]

Freebase: open, shared database of the world’s knowledge

Freebase, created by Metaweb Technologies, is an open database of the world’s information. It’s built by the community and for the community – free for anyone to query, contribute to, build applications on top of, or integrate into their websites.

Already, Freebase covers millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC archives, it contains structured information on many popular topics, including movies, music, people and locations – all reconciled and freely available via an open API. This information is supplemented by the efforts of a passionate global community of users who are working together to add structured information on everything from philosophy to European railway stations to the chemical properties of common food ingredients.

By structuring the world’s data in this manner, the Freebase community is creating a global resource that will one day allow people and machines everywhere to access information far more easily and quickly than they can today.

Freebase aims to “open up the silos of data and the connections between them”, according to founder Danny Hillis at the Web 2.0 Summit. Freebase is a database that has all kinds of data in it, plus an API. Because it’s an open database, anyone can enter new data into Freebase. An example page in the Freebase db looks pretty similar to a Wikipedia page. When you enter new data, the app can make suggestions about content. The topics in Freebase are organized by type, and you can connect pages with links and semantic tagging. So in summary, Freebase is all about shared data and what you can do with it.
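To make the “database with an API” point concrete, here is a minimal read-query sketch in Python. It assumes the mqlread service endpoint and JSON envelope as described in Freebase’s developer documentation; the exact URL and response shape should be treated as assumptions rather than guarantees. MQL (Metaweb Query Language) is query-by-example: nulls and empty lists mark the slots you want Freebase to fill in.

    import json
    import urllib.parse
    import urllib.request

    MQLREAD = "http://api.freebase.com/api/service/mqlread"  # assumed endpoint

    # Ask for all albums by the artist named "The Police"; the empty
    # list is the slot Freebase fills in.
    query = {"query": {"type": "/music/artist",
                       "name": "The Police",
                       "album": []}}

    url = MQLREAD + "?" + urllib.parse.urlencode({"query": json.dumps(query)})
    with urllib.request.urlopen(url) as resp:
        envelope = json.load(resp)

    # On success, the envelope's "result" key holds the filled-in query.
    result = envelope.get("result") or {}
    print(result.get("album"))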

The Company behind

Metaweb Technologies, Inc. is a San Francisco-based company developing Metaweb, a semantic data storage infrastructure for the web, and the first application built on that platform, Freebase, described as an “open, shared database of the world’s knowledge”. The company was founded by Danny Hillis and others as a spinoff of Applied Minds in July 2005 and operated in stealth mode until 2007.

Reach

According to Quantcast, which we believe is very accurate, Freebase is basically not a popular site despite the press attention it gets, reaching fewer than 5,000 unique visitors per month. Compete reports slightly more than 8,000 uniques per month.

The People

William Daniel “Danny” Hillis (born September 25, 1956, in Baltimore, Maryland) is an American inventor, entrepreneur and author. He co-founded Thinking Machines Corporation, the company that developed the Connection Machine, a parallel supercomputer Hillis designed at MIT. He also co-founded the Long Now Foundation, Applied Minds and Metaweb Technologies, and is the author of The Pattern on the Stone: The Simple Ideas That Make Computers Work.

Investors

In March 2006, Freebase received $15 million in funding from investors including Benchmark Capital, Millennium Technology Ventures and Omidyar Network.

Freebase is in alpha.

Freebase.com is the first Semantic App featured by Web2Innovations in its planned series of publications in which we will try to discover, highlight and feature the next generation of web-based semantic applications, engines, platforms, mash-ups, machines, products, services, mixtures, parsers, approaches and beyond.

The purpose of these publications is to discover and showcase today’s Semantic Web Apps and projects. We’re not going to rank them, because there is no way to rank these apps at this time – many are still in alpha and private beta.

[ http://freebase.com ]
[ http://roblog.freebase.com ]
[ http://www.crunchbase.com/company/freebase ]
[ http://www.readwriteweb.com/archives/10_semantic_apps_to_watch.php ]
[ http://en.wikipedia.org/wiki/Danny_Hillis ]
[ http://www.metaweb.com ]
[ http://en.wikipedia.org/wiki/Metaweb_Technologies ]