Tag Archives: Semantic Web

Massive second round of funding for Freebase – $42 Million

Freebase, the open and shared database of the world’s knowledge, has raised $42 million in a Series B round that included Benchmark Capital and Goldman Sachs. Total funding to date is $57 million.

The investment is considerable, and comes at a time when a number of experts are betting that a more powerful, “semantic” Web is about to emerge, where data about information is much more structured than it is today.

In March 2006, Freebase received $15 million in funding from investors including Benchmark Capital, Millennium Technology Ventures and Omidyar Network.

Freebase, created by Metaweb Technologies, is an open database of the world’s information. It’s built by the community and for the community – free for anyone to query, contribute to, build applications on top of, or integrate into their websites.

Already, Freebase covers millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC archives, it contains structured information on many popular topics, including movies, music, people and locations – all reconciled and freely available via an open API. This information is supplemented by the efforts of a passionate global community of users who are working together to add structured information on everything from philosophy to European railway stations to the chemical properties of common food ingredients.

By structuring the world’s data in this manner, the Freebase community is creating a global resource that will one day allow people and machines everywhere to access information far more easily and quickly than they can today.

Freebase aims to “open up the silos of data and the connections between them”, according to founder Danny Hillis, speaking at the Web 2.0 Summit. Freebase is a database holding all kinds of data, exposed through an API. Because it is an open database, anyone can enter new data. An example page in Freebase looks much like a Wikipedia page. When you enter new data, the app can suggest content. The topics in Freebase are organized by type, and you can connect pages with links and semantic tags. In short, Freebase is all about shared data and what you can do with it.

Here’s a video tour of how Freebase works. Freebase categorizes knowledge according to thousands of “types” of information, such as film, director or city. Those are the highest order of categorization. Underneath those types are “topics,” which are individual examples of the types, such as Annie Hall and Woody Allen. It boasts two million topics to date. This lets Freebase represent information in a structured way and support queries from web developers who want to build applications around it. It also solicits people to contribute their knowledge to the database, governed by a community of editors. It offers its data under a Creative Commons license, through an open API, so that it can be used to power applications.
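To make the type/topic split concrete, here is a minimal in-memory sketch. It is not the Freebase API: the `Topic` and `TinyGraph` names, the `directed_by` property and the `by_type` helper are all hypothetical, chosen only to show how typed topics can answer structured queries.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One individual thing, e.g. a particular film (hypothetical model)."""
    name: str
    types: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


class TinyGraph:
    """Toy store of typed topics; an illustration, not Freebase itself."""

    def __init__(self):
        self.topics = {}

    def add(self, name, types, **properties):
        topic = Topic(name, set(types), dict(properties))
        self.topics[name] = topic
        return topic

    def by_type(self, type_name):
        # Structured query: every topic that carries the given type.
        return sorted(t.name for t in self.topics.values() if type_name in t.types)


graph = TinyGraph()
graph.add("Annie Hall", ["film"], directed_by="Woody Allen")
graph.add("Woody Allen", ["director"])

films = graph.by_type("film")
director = graph.topics["Annie Hall"].properties["directed_by"]
director_types = sorted(graph.topics[director].types)
```

Because the director is stored as a link to another topic rather than as free text, an application can follow it from the film to the person and keep querying from there.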

This is one of the biggest Series B rounds of the past 12 months. And what Google is trying to do with Knol is probably the same thing Freebase is trying to achieve: replicate and commercialize the huge success of the non-profit Wikipedia.

Other semantic applications and projects include Powerset, Twine, AdaptiveBlue, Hakia, Talis, LinkedWords, NosyJoe and TrueKnowledge, among others.

Peter Rip, an investor in Twine, quickly reacted to the comparison between Freebase and Twine made by VentureBeat’s Matt Marshall:

As an investor in Twine, allow me correct you about Twine and Metaweb’s positioning. You correctly point out that Metaweb is building a database about concepts and things on the Web. Twine is not. Twine is really more of an application than a database. It is a way for persons to share information about their interests. So they are complementary, not competitive.

What’s most important is that Twine will be able to use all the structure in something like Metaweb (and other content sources) to enrich the user’s ability to track and manage information. Think of Metaweb as a content repository and Twine as the app that uses content for specific purposes.

Twine is still in closed beta. So the confusion is understandable, especially with all the hype surrounding the category.

Nova Spivack, the founder of Twine, has also commented:

Freebase and Twine are not competitive. That should be corrected in the above article. In fact our products are very different and have different audiences. Twine is for helping people and groups share knowledge around their interests and activities. It is for managing personal and group knowledge, and ultimately for building smarter communities of interest and smarter teams.

Metaweb, by contrast, is a data source that Twine can use, but is not focused on individuals or on groups. Rather Metaweb is building a single public information database, that is similar to the Wikipedia in some respects. This is a major difference in focus and functionality. To use an analogy, Twine is more like a semantic Facebook, and Metaweb is more like a semantic Wikipedia.

Freebase is in alpha.

Freebase.com was the first Semantic App featured by Web2Innovations in its planned series of publications, in which we will try to discover, highlight and feature the next generation of web-based semantic applications, engines, platforms, mash-ups, machines, products, services, mixtures, parsers, approaches and beyond.

The purpose of these publications is to discover and showcase today’s Semantic Web Apps and projects. We’re not going to rank them, because there is no way to rank these apps at this time – many are still in alpha and private beta.
More

http://www.metaweb.com/about/
http://freebase.com
http://roblog.freebase.com
http://venturebeat.com/2008/01/14/shared-database-metaweb-gets-42m-boost/
http://www.techcrunch.com/2008/01/16/freebase-takes-42-million/
http://www.dmwmedia.com/news/2008/01/15/freebase-developer-metaweb-technologies-gets-$42.4-million
http://www.crunchbase.com/company/freebase
http://www.readwriteweb.com/archives/10_semantic_apps_to_watch.php
http://en.wikipedia.org/wiki/Danny_Hillis
http://www.metaweb.com
http://en.wikipedia.org/wiki/Metaweb_Technologies
https://web2innovations.com/money/2007/11/30/freebase-open-shared-database-of-the-worlds-knowledge/
http://mashable.com/2007/07/17/freebase/
http://squio.nl/blog/2007/04/02/freebase-life-the-universe-and-everything/

Hakia takes on major search engines backed up by a small army of international investors

In our planned series of publications about the Semantic Web and its apps, Hakia is our third featured company.

Hakia.com, just like Freebase and Powerset, relies heavily on semantic technologies to deliver what it hopes are better and more meaningful results to its users.

Hakia is building the Web’s new “meaning-based” (semantic) search engine with the sole purpose of improving search relevancy and interactivity, pushing the current boundaries of Web search. The benefits to the end user are search efficiency, richness of information, and time savings. The basic promise is to bring search results by meaning match – similar to the human brain’s cognitive skills – rather than by the mere occurrence (or popularity) of search terms. Hakia’s new technology is a radical departure from the conventional indexing approach, because indexing has severe limitations to handle full-scale semantic search.

Hakia’s capabilities will appeal to all Web searchers – especially those engaged in research on knowledge intensive subjects, such as medicine, law, finance, science, and literature. The mission of hakia is the commitment to search for better search.

Here are the technological differences of hakia in comparison to conventional search engines.

QDEX Infrastructure

  • hakia’s designers broke from the decades-old indexing method and built a more advanced system called QDEX (which stands for Query Detection and Extraction) to enable semantic analysis of Web pages and “meaning-based” search.
  • QDEX analyzes each Web page much more intensely, dissecting it to its knowledge bits, then storing them as gateways to all possible queries one can ask.
  • The information density in the QDEX system is significantly higher than that of a typical index table, which is a basic requirement for undertaking full semantic analysis.
  • The QDEX data resides on a distributed network of fast servers using a mosaic-like data storage structure.
  • QDEX has superior scalability properties because data segments are independent of each other.

SemanticRank Algorithm

  • The SemanticRank algorithm of hakia comprises innovative solutions from the disciplines of Ontological Semantics, Fuzzy Logic, Computational Linguistics, and Mathematics.
  • Designed for the express purpose of higher relevancy.
  • Sets the stage for search based on meaning of content rather than the mere presence or popularity of keywords.
  • Deploys a layer of on-the-fly analysis with superb scalability properties.
  • Takes into account the credibility of sources among equally meaningful results.
  • Evolves its capacity of understanding text from BETA operation onward.

In our tests we asked Hakia three questions in plain English:

Why did the stock market crash? [ http://www.hakia.com/search.aspx?q=why+did+the+stock+market+crash%3F ]
Where do I get good bagels in Brooklyn? [ http://www.hakia.com/search.aspx?q=where+can+i+find+good+bagels+in+brooklyn ]
Who invented the Internet? [ http://www.hakia.com/search.aspx?q=who+invented+the+internet ]

It returned intelligent results for all three. For example, Hakia understood that, when we asked “why,” we would be interested in results containing the words “reason for”, and it produced some relevant ones.

Hakia is one of the few promising alternative search engines being closely watched by Charles Knight at his blog AltSearchEngines.com. Its focus is on natural language processing methods that try to deliver “meaningful” search results. Hakia attempts to analyze the concept behind a search query, in particular by doing sentence analysis. Most other major search engines, including Google, analyze keywords. The company believes that the future of search engines will go beyond keyword analysis: search engines will talk back to you and in effect become your search assistant. One point worth noting is that Hakia currently still has some human post-editing going on, so it is not 100% computer-powered at this point and is closer to a human-powered search engine or a combination of the two.

They hope to provide better results for complex queries than Google currently offers, but they have a long way to go to catch up, considering Google’s vast lead in the search market, sophisticated technology, and rich coffers. Hakia’s semantic search technology aims to understand the meaning of search queries to improve the relevancy of the search results.

Instead of relying on indexing the web or on the popularity of particular web pages, as many search engines do, hakia tries to match the meaning of the search terms to mimic the cognitive processes of the human brain.

“We’re mainly focusing on the relevancy problem in the whole search experience,” said Dr. Berkan in an interview Friday. “You enter a question and get better relevancy and better results.”

Dr. Berkan contends that search engines that use indexing and popularity algorithms are not as reliable with combinations of four or more words since there are not enough statistics available on which to base the most relevant results.

“What we are doing is an ultimate approach, doing meaning-based searches so we understand the query and the text, and make an association between them by semantic analysis,” he said.

Analyzing whole sentences instead of keywords would definitely increase the company’s cost of indexing and processing the world’s information. The case is much the same with Powerset, which also does deep contextual analysis of every sentence on every web page and is publicly known to have higher indexing and analysis costs than Google. Considering that Google has more than 450,000 servers in several major data centers, and that hakia’s indexing and storage costs might be even higher, the approach could cost its investors a fortune just to keep the company alive.

It would be interesting to find out whether hakia is also building its architecture on the Hbase/Hadoop environment, just as Powerset does.

In the context of indexing and storing the world’s information, it is worth mentioning that there is yet another start-up search engine, called Cuill, that claims to have invented a technology for cheaper and faster indexing than Google’s. Cuill claims that its indexing costs will be 1/10th of Google’s, based on new search architectures and relevance methods.

On the subject of semantic textual analysis and the presentation of meaningful results, NosyJoe.com is a great example of both. It does not seem, however, to be aiming to index and store the world’s information and then apply contextual analysis to it; rather, it focuses on what is of quality and importance to the people participating in its social search engine.

A few months ago Hakia launched a new social feature called “Meet Others.” It gives you the option, from a search results page, to jump to a page on the service where everyone who searches for the topic can communicate.

For some idealized types of searching, it could be great. For example, suppose you were searching for information on a medical condition. Meet Others could connect you with other people looking for info about the condition, making an ad-hoc support group. On the Meet Others page, you’re able to add comments, or connect directly with the people on the page via anonymous e-mail or by Skype or instant messaging.

On the other hand, social recommendations and social elements like Hakia’s Meet Others feature need huge traffic before such an interesting feature turns into an effective information discovery tool. For example, Google, with its more than 500 million unique searchers per month, could easily beat such social attempts by the smaller players if it decided to employ its users, in one way or another, to find, judge the relevancy of, share and recommend results that others also search for. Such attempts by Google are already under way, as one can read here: Is Google trying to become a social search engine.

Reach

According to Quantcast, Hakia is not a very popular site and reaches fewer than 150,000 unique visitors per month. Compete reports much better numbers: slightly below 1 million uniques per month. Considering that the search engine is still in beta, these numbers are more than great. Looking further at the traffic curves on both measurement sites, hakia’s traffic appears to be campaign-based, in other words generated by advertising, promotion or PR activity rather than steady organic traffic from heavy use of the site.

The People

Founded in 2004, hakia is a privately held company with headquarters in downtown Manhattan. hakia operates globally with teams in the United States, Turkey, England, Germany, and Poland.

The founder of hakia is Dr. Berkan, a nuclear scientist with a specialization in artificial intelligence and fuzzy logic. He is the author of several articles in this area, and of the book Fuzzy Systems Design Principles, published by IEEE in 1997. Before launching hakia, Dr. Berkan worked for the U.S. Government for a decade with emphasis on information handling, criticality safety and safeguards. He holds a Ph.D. in Nuclear Engineering from the University of Tennessee, and a B.S. in Physics from Hacettepe University, Turkey. He has been developing the company’s semantic search technology with help from Professor Victor Raskin of Purdue University, who specializes in computational linguistics and ontological semantics, and is the company’s chief scientific advisor.

Dr. Berkan resisted VC firms because he worried they would demand too much control and push development too fast to get the technology to the product phase so they could earn back their investment.

When he met Dr. Raskin, he discovered they had similar ideas about search and semantic analysis, and by 2004 they had laid out their plans.

They currently have 20 programmers working on building the system in New York, and another 20 to 30 contractors working remotely from different locations around the world, including Turkey, Armenia, Russia, Germany, and Poland.
The programmers are developing the search engine so it can better handle complex queries and maybe surpass some of its larger competitors.

Management

  • Dr. Riza C. Berkan, Chief Executive Officer
  • Melek Pulatkonak, Chief Operating Officer
  • Tim McGuinness, Vice President, Search
  • Stacy Schinder, Director of Business Intelligence
  • Dr. Christian F. Hempelmann, Chief Scientific Officer
  • John Grzymala, Chief Financial Officer

Board of Directors

  • Dr. Pentti Kouri, Chairman
  • Dr. Riza C. Berkan, CEO
  • John Grzymala
  • Anuj Mathur, Alexandra Global Fund
  • Bill Bradley, former U.S. Senator
  • Murat Vargi, KVK
  • Ryszard Krauze, Prokom Investments

Advisory Board

  • Prof. Victor Raskin (Purdue University)
  • Prof. Yorick Wilks (Sheffield University, UK)
  • Mark Hughes

Investors

Hakia is known to have raised $11 million in its first round of funding from a panoply of investors scattered across the globe who were attracted by the company’s semantic search technology.

The New York-based company said it decided to snub the usual players in the venture capital community lining Silicon Valley’s Sand Hill Road and opted for its international connections instead, including financial firms, angel investors, and a telecommunications company.

Poland

Among them were Poland’s Prokom Investments, an investment group active in the oil, real estate, IT, financial, and biotech sectors.

Turkey

Another investor, Turkey’s KVK, distributes mobile telecom services and products in Turkey. Also from Turkey, angel investor Murat Vargi pitched in some funding. He is one of the founding shareholders in Turkcell, a mobile operator and the only Turkish company listed on the New York Stock Exchange.

Malaysia

In Malaysia, hakia secured funding from angel investor Lu Pat Ng, who represented his family, which has substantial investments in companies worldwide.

Finland

From Finland, hakia turned to Dr. Pentti Kouri, an economist and VC who was a member of the Nokia board in the 1980s. He has taught at Stanford, Yale, New York University, and Helsinki University, and worked as an economist at the International Monetary Fund. He is currently based in New York.

United States

In the United States, hakia received funding from Alexandra Investment Management, an investment advisory firm that manages a global hedge fund. Also from the U.S., former Senator and New York Knicks basketball player Bill Bradley has joined the company’s board, along with Dr. Kouri, Mr. Vargi, Anuj Mathur of Alexandra Investment Management, and hakia CEO Riza Berkan.

Hakia was one of the first alternative search engines to make the home page of Web 2.0 Innovations in the past year: http://web2innovations.com/hakia.com.php

Hakia.com is the third Semantic App featured by Web2Innovations in its planned series of publications, in which we will try to discover, highlight and feature the next generation of web-based semantic applications, engines, platforms, mash-ups, machines, products, services, mixtures, parsers, approaches and beyond.

Via

[ http://www.hakia.com/ ]
[ http://blog.hakia.com/ ]
[ http://www.hakia.com/about.html ]
[ http://www.readwriteweb.com/archives/hakia_takes_on_google_semantic_search.php ]
[ http://www.readwriteweb.com/archives/hakia_meaning-based_search.php ]
[ http://siteanalytics.compete.com/hakia.com/?metric=uv ]
[ http://www.internetoutsider.com/2007/07/the-big-problem.html ]
[ http://www.quantcast.com/search/hakia.com ]
[ http://www.redherring.com/Home/19789 ]
[ http://web2innovations.com/hakia.com.php ]
[ http://www.pandia.com/sew/507-hakia.html ]
[ http://www.searchenginejournal.com/hakias-semantic-search-the-answer-to-poor-keyword-based-relevancy/5246/ ]
[ http://arstechnica.com/articles/culture/hakia-semantic-search-set-to-music.ars ]
[ http://www.news.com/8301-10784_3-9800141-7.html ]
[ http://searchforbettersearch.com/ ]
[ https://web2innovations.com/money/2007/12/01/is-google-trying-to-become-a-social-search-engine/ ]
[ http://www.web2summit.com/cs/web2006/view/e_spkr/3008 ]

Powerset – the natural language processing search engine empowered by Hbase in Hadoop

In our planned series of publications about the Semantic Web and its apps, Powerset is the second company to be featured, after Freebase.

Powerset is a Silicon Valley based company building a transformative consumer search engine based on natural language processing. Their unique innovations in search are rooted in breakthrough technologies that take advantage of the structure and nuances of natural language. Using these advanced techniques, Powerset is building a large-scale search engine that breaks the confines of keyword search. By making search more natural and intuitive, Powerset is fundamentally changing how we search the web, and delivering higher quality results.

Powerset’s search engine is currently under development and is closed to the general public. You can keep an eye on the company to learn more about its technology and approach.

Despite all the press attention Powerset is getting, too few details about the search engine are publicly available. In fact, Powerset has lately been one of the most buzzed-about companies in Silicon Valley, for good or bad.

Power set is a term from mathematics: given a set S, the power set (or powerset) of S, written P(S) or 2^S, is the set of all subsets of S. In axiomatic set theory (as developed, e.g., in the ZFC axioms), the existence of the power set of any set is postulated by the axiom of power set. Any subset F of P(S) is called a family of sets over S.
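As a small illustration of the definition, the sketch below enumerates the power set of a finite set in Python. The function name `power_set` is our own; the approach simply collects the subsets of every size from 0 up to the size of S, so a set with n elements yields 2^n subsets.

```python
from itertools import chain, combinations


def power_set(s):
    """Return all subsets of the finite set s as a list of frozensets."""
    items = list(s)
    # Subsets of size 0, 1, ..., len(items), chained into one sequence.
    all_sizes = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [frozenset(subset) for subset in all_sizes]


subsets = power_set({"a", "b", "c"})
```

For the three-element set above, `subsets` holds eight members, ranging from the empty set to the full set itself.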

From the latest publicly available information about Powerset we learn that, just like some other start-up search engines, it is using Hbase in a Hadoop environment to process vast amounts of data.

It also appears that Powerset relies on a number of proprietary technologies such as the XLE, licensed from PARC, ranking algorithms, and the ever-important onomasticon (a list of proper nouns naming persons or places).

For any other component, Powerset tries to use open source software whenever available. One of the unsung heroes that form the foundation for all of these components is the ability to process insane amounts of data. This is especially true for a Natural Language search engine. A typical keyword search engine will gather hundreds of terabytes of raw data to index the Web. Then, that raw data is analyzed to create a similar amount of secondary data, which is used to rank search results. Since Powerset’s technology creates a massive amount of secondary data through its deep language analysis, Powerset will be generating far more data than a typical search engine, eventually ranging up to petabytes of data.
Powerset has already benefited greatly from the use of Hadoop: their index build process is entirely based on a Hadoop cluster running the Hadoop Distributed File System (HDFS) and makes use of Hadoop’s map/reduce features.

In fact Google also uses a number of well-known components to fulfill their enormous data processing needs: a distributed file system (GFS) ( http://labs.google.com/papers/gfs.html ), Map/Reduce ( http://labs.google.com/papers/mapreduce.html ), and BigTable ( http://labs.google.com/papers/bigtable.html ).

Hbase is effectively the open-source equivalent of Google’s Bigtable, and, as far as we understand the matter, it is a great technological achievement of the people behind Powerset. Both Jim Kellerman and Michael Stack are from Powerset and are the initial contributors to Hbase.

Hbase could be the panacea for Powerset in scaling its index up to Google’s level, yet copying Google’s approach is perhaps not the right direction for a small technology company like Powerset. We wonder whether Cuill, yet another start-up search engine, which claims to have invented a technology for cheaper and faster indexing than Google’s, has built its architecture on the Hbase/Hadoop environment. Cuill claims that its indexing costs will be 1/10th of Google’s, based on new search architectures and relevance methods. If that is true, what would Powerset’s costs be, considering that Powerset probably has higher indexing costs even than Google because it does deep contextual analysis of every sentence on every web page? Considering that Google has more than 450,000 servers in several major data centers, and that Powerset’s indexing and storage costs might be even higher, the approach Powerset is taking might be a costly business for its investors.

Unless Hbase and Hadoop are the secret answer Powerset relies on to significantly reduce the costs. 

Hadoop is an interesting software platform that lets one easily write and run applications that process vast amounts of data.

Here’s what makes Hadoop especially useful:

  • Scalable: Hadoop can reliably store and process petabytes.
  • Economical: It distributes the data and processing across clusters of commonly available computers. These clusters can number into the thousands of nodes.
  • Efficient: By distributing the data, Hadoop can process it in parallel on the nodes where the data is located. This makes it extremely rapid.
  • Reliable: Hadoop automatically maintains multiple copies of data and automatically redeploys computing tasks based on failures.

Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS). MapReduce divides applications into many small blocks of work. HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster. MapReduce can then process the data where it is located.
Hadoop has been demonstrated on clusters with 2000 nodes. The current design target is 10,000 node clusters.
Hadoop is a Lucene sub-project that contains the distributed computing platform that was formerly a part of Nutch.
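The MapReduce idea can be sketched in a few lines of plain Python. This is only a single-process illustration of the programming model, not Hadoop’s API and not Powerset’s index build: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are our own, and the classic word-count task stands in for real indexing work.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all the values seen for a single key."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different node; here we loop locally.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["search the web", "index the web"])
```

Since each map call sees only its own document and each reduce call sees only one key, the work splits into independent pieces, which is what allows a cluster to run them in parallel next to the data.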

Hbase’s background

Google’s Bigtable, a distributed storage system for structured data, is a very effective mechanism for storing very large amounts of data in a distributed environment. Just as Bigtable leverages the distributed data storage provided by the Google File System, Hbase will provide Bigtable-like capabilities on top of Hadoop. Data is organized into tables, rows and columns, but a query language like SQL is not supported. Instead, an Iterator-like interface is available for scanning through a row range (and of course there is an ability to retrieve a column value for a specific key). Any particular column may have multiple values for the same row key. A secondary key can be provided to select a particular value or an Iterator can be set up to scan through the key-value pairs for that column given a specific row key.
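The data model described above can be mimicked with nested dictionaries. The sketch below is a toy, not the Hbase client API: `TinyTable` and its methods are hypothetical names, and we assume the secondary key is an integer version number (Bigtable uses a timestamp for this), with the highest version treated as the current value.

```python
class TinyTable:
    """Toy Bigtable-like table: row key -> column -> {version: value}."""

    def __init__(self):
        self._rows = {}

    def put(self, row, column, version, value):
        # A column may hold several values for the same row, one per version.
        self._rows.setdefault(row, {}).setdefault(column, {})[version] = value

    def get(self, row, column, version=None):
        """Fetch one value; without a secondary key, return the newest one."""
        versions = self._rows[row][column]
        if version is None:
            version = max(versions)
        return versions[version]

    def scan(self, start_row, stop_row):
        """Iterate rows in key order over the half-open range [start, stop)."""
        for row in sorted(self._rows):
            if start_row <= row < stop_row:
                yield row, self._rows[row]

    def versions(self, row, column):
        """Iterate the (version, value) pairs of one column for one row key."""
        return iter(sorted(self._rows[row][column].items()))


table = TinyTable()
table.put("row1", "contents", 1, "old page")
table.put("row1", "contents", 2, "new page")
table.put("row2", "contents", 1, "other page")
table.put("row3", "contents", 1, "last page")

latest = table.get("row1", "contents")
first = table.get("row1", "contents", version=1)
scanned = [row for row, _ in table.scan("row1", "row3")]
history = list(table.versions("row1", "contents"))
```

There is no query language here, only key lookups and ordered scans, which mirrors the trade-off the paragraph describes.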

Reach

According to Quantcast, Powerset is not a popular site and reaches fewer than 20,000 unique visitors per month, around 10,000 of them Americans. Compete reports much the same: slightly more than 20,000 uniques per month. Considering that the search engine is still in alpha, these numbers are not that bad.

The People

Powerset has assembled a star team of talented engineers, researchers, product innovators and entrepreneurs to realize an ambitious vision for the future of search. The team comprises industry leaders from a diverse set of companies including: Altavista, Apple, Ask.com, BBN, Digital, IDEO, IBM, Microsoft, NASA, PARC, Promptu, SRI, Tellme, Whizbang! Labs, and Yahoo!.

The founders of Powerset are Barney Pell and Lorenzo Thione, and the company is headquartered in San Francisco. Barney Pell recently stepped down as CEO and is now the company’s CTO.

Barney Pell, Ph.D. (CTO) For over 15 years Barney Pell (Ph.D. Computer science, Cambridge University 1993) has pursued groundbreaking technical and commercial innovation in A.I. and Natural Language understanding at research institutions including NASA, SRI, Stanford University and Cambridge University. In startup companies, Dr. Pell was Chief Strategist and VP of Business Development at StockMaster.com (acquired by Red Herring in March, 2000) and later had the same role at Whizbang! Labs. Just prior to Powerset, Pell was an Entrepreneur in Residence at Mayfield, one of the top VC firms in Silicon Valley.

Lorenzo Thione (Product Architect) Mr. Thione brings to Powerset years of research experience in computational linguistics and search from Research Scientist positions at the CommerceNet consortium and the Fuji-Xerox Palo Alto Laboratory. His main research focus has been discourse parsing and document analysis, automatic summarization, question answering and natural language search, and information retrieval. He has co-authored publications in the field of computational linguistics and is a named inventor on 13 worldwide patent applications spanning the fields of computational linguistics, mobile user interfaces, search and information retrieval, speech technology, security and distributed computing. A native of Milan, Italy, Mr. Thione holds a Masters in Software Engineering from the University of Texas at Austin.

Board of Directors

Aside from Barney Pell, who also serves on the company’s board of directors, the other board members are:

Charles Moldow (BOD) is a general partner at Foundation Capital. He joined Foundation on the heels of successfully building two companies from early start-up through greater than $100 million in sales. Most notably, Charles led Tellme Networks in raising one of the largest private financing rounds in the country post Internet bubble, adding $125 million in cash to the company balance sheet during tough market conditions in August, 2000. Prior to Tellme, Charles was a member of the founding team of Internet access provider @Home Network. In 1998, Charles assisted in the $7 billion acquisition of Excite Network. After the merger, Charles became General Manager of Matchlogic, the $80 million division focused on interactive advertising.

Peter Thiel (BOD) is a partner at Founders Fund VC Firm in San Francisco. In 1998, Peter co-founded PayPal and served as its Chairman and CEO until the company’s sale to eBay in October 2002 for $1.5 billion. Peter’s experience in finance includes managing a successful hedge fund, trading derivatives at CS Financial Products, and practicing securities law at Sullivan & Cromwell. Peter received his BA in Philosophy and his JD from Stanford.

Investors

In June 2007 Powerset raised $12.5M in a Series A round of funding from Foundation Capital and The Founder’s Fund. Early investors include Eric Tilenius and Peter Thiel, who is also an early investor in Facebook.com. Other early investors are as follows:

CommerceNet is an entrepreneurial research institute focused on making the world a better place by fulfilling the promise of the Internet. CommerceNet invests in exceptional people with bold ideas, freeing them to pursue visions outside the comfort zone of research labs and venture funds and share in their success.

Dr. Tenenbaum is a world-renowned Internet commerce pioneer and visionary. He was founder and CEO of Enterprise Integration Technologies, the first company to conduct a commercial Internet transaction (1992), secure Web transaction (1993) and Internet auction (1993). In 1994, he founded CommerceNet to accelerate business use of the Internet. In 1997, he co-founded Veo Systems, the company that pioneered the use of XML for automating business-to-business transactions. Dr. Tenenbaum joined Commerce One in January 1999, when it acquired Veo Systems. As Chief Scientist, he was instrumental in shaping the company’s business and technology strategies for the Global Trading Web. Earlier in his career, Dr. Tenenbaum was a prominent AI researcher, and led AI research groups at SRI International and Schlumberger Ltd. Dr. Tenenbaum is a Fellow and former board member of the American Association for Artificial Intelligence, and a former Consulting Professor of Computer Science at Stanford. He currently serves as an officer and director of Webify Solutions and Medstory Inc., and is a Consulting Professor of Information Technology at Carnegie Mellon’s new West Coast campus. Dr. Tenenbaum holds B.S. and M.S. degrees in Electrical Engineering from MIT, and a Ph.D. from Stanford. 

Allan Schiffman was CTO and founder of Terisa Systems, a pioneer in bringing communications security technology to the Web software industry. Earlier, Mr. Schiffman was Chief Technology Officer at Enterprise Integration Technologies, a pioneer in the development of key security protocols for electronic commerce over the Internet. In these roles, Mr. Schiffman has raised industry awareness of the role of security and public key cryptography in ecommerce by giving more than thirty public lectures and tutorials. Mr. Schiffman was also a member of the team that designed the Secure Electronic Transactions (SET) payment card protocol commissioned by MasterCard and Visa. Mr. Schiffman co-designed the first security protocol for the Web, the Secure HyperText Transfer Protocol (S-HTTP). Mr. Schiffman led the development of the first secure Web browser, Secure Mosaic, which was fielded to CommerceNet members for ecommerce trials in 1994. Earlier in his career, Mr. Schiffman led the development of a family of high-performance Smalltalk implementations that gained both academic recognition and commercial success. These systems included several innovations widely adopted by other object-oriented language implementers, such as the “just-in-time compilation” technique universally used by current Java virtual machines. Mr. Schiffman holds an M.S. in Computer Science from Stanford University.

Rob Rodin is the Chairman and CEO of RDN Group, strategic advisors focused on corporate transitions, customer interface, sales and marketing, distribution and supply chain management. Additionally, he serves as Vice Chairman, Executive Director and Chairman of the Investment Committee of CommerceNet, which researches and funds open platform, interoperable business services to advance commerce. Prior to these positions, Mr. Rodin served as CEO and President of Marshall Industries, where he engineered the reinvention of the company, turning a conventionally successful $500 million distributor into a web-enabled $2 billion global competitor. “Free, Perfect and Now: Connecting to the Three Insatiable Customer Demands”, Mr. Rodin’s bestselling book, chronicles the radical transformation of Marshall Industries.

The Founders Fund – The Founders Fund, L.P. is a San Francisco-based venture capital fund that focuses primarily on early-stage, high-growth investment opportunities in the technology sector. The Fund’s management team is composed of investors and entrepreneurs with relevant expertise in venture capital, finance, and Internet technology. Members of the management team previously led PayPal, Inc. through several rounds of private financing, a private merger, an initial public offering, a secondary offering, and its eventual sale to eBay, Inc. The Founders Fund possesses four key attributes that position it well for success: access to elite research universities, contacts with entrepreneurs, operational and financial expertise, and the ability to pick winners. Currently, the Founders Fund is invested in over 20 companies, including Facebook, Ironport, Koders, Engage, and the newly-acquired CipherTrust.

Amidzad – Amidzad is a seed and early-stage venture capital firm focused on investing in emerging growth companies on the West Coast, with over 50 years of combined entrepreneurial experience in building profitable, global enterprises from the ground up and over 25 years of combined investing experience in successful information technology and life science companies. Over the years, Amidzad has assembled a world-class network of serial entrepreneurs, strategic investors, and industry leaders who actively assist portfolio companies as Entrepreneur Partners and Advisors. Amidzad has invested in companies like Danger, BIX, Songbird, Melodis, Freewebs, Agitar, Affinity Circles, Litescape and Picaboo.

Eric Tilenius brings a two-decade track record that combines venture capital, startup, and industry-leading technology company experience. Eric has made over a dozen investments in early-stage technology, internet, and consumer start-ups around the globe through his investment firm, Tilenius Ventures. Prior to forming Tilenius Ventures, Eric was CEO of Answers Corporation (NASDAQ: ANSW), which runs Answers.com, one of the leading information sites on the internet. He previously was an entrepreneur-in-residence at venture firm Mayfield. Prior to Mayfield, Eric was co-founder, CEO, and Chairman of Netcentives Inc., a leading loyalty, direct, and promotional internet marketing firm. Eric holds an MBA from the Stanford University Graduate School of Business, where he graduated as an Arjay Miller scholar, and an undergraduate degree in economics, summa cum laude, from Princeton University.

Esther Dyson does business as EDventure, the reclaimed name of the company she owned for 20-odd years before selling it to CNET Networks in 2004. Her primary activity is investing in start-ups and guiding many of them as a board member. Her board seats include Boxbe, CVO Group (Hungary), Eventful.com, Evernote, IBS Group (Russia, advisory board), Meetup, Midentity (UK), NewspaperDirect, Voxiva, Yandex (Russia)… and WPP Group (not a start-up). Some of her other past IT investments include Flickr and Del.icio.us (sold to Yahoo!), BrightMail (sold to Symantec), Medstory (sold to Microsoft), Orbitz (sold to Cendant and later re-IPOed). Her current holdings include ActiveWeave, BlogAds, ChoiceStream, Democracy Machine, Dotomi, Linkstorm, Ovusoft, Plazes, Powerset, Resilient, Tacit, Technorati, Visible Path, Vizu.com and Zedo. On the non-profit side, Dyson sits on the boards of the Eurasia Foundation, the National Endowment for Democracy, the Santa Fe Institute and the Sunlight Foundation. She also blogs occasionally for the Huffington Post, as Release 0.9.

Adrian Weller – Adrian graduated in 1991 with first class honours in mathematics from Trinity College, Cambridge, where he met Barney. He moved to NY, ran Goldman Sachs’ US Treasury options trading desk and then joined the fixed income arbitrage trading group at Salomon Brothers. He went on to run US and European interest rate trading at Citadel Investment Group in Chicago and London. Recently, Adrian has been traveling, studying and managing private investments. He resides in Dublin with his wife, Laura, and baby daughter, Rachel.

Azeem Azhar – Azeem is currently a technology executive focussed on corporate innovation at a large multinational. He began his career as a technology writer, first at The Guardian and then The Economist. While at The Economist, he launched Economist.com. Since then, he has been involved with several internet and technology businesses including launching BBC Online and founding esouk.com, an incubator. He was Chief Marketing Officer for Albert-Inc, a Swiss AI/natural language processing search company and UK MD of 20six, a blogging service. He has advised several internet start-ups including Mondus, Uvine and Planet Out Partners, where he sat on the board. He has a degree in Philosophy, Politics and Economics from Oxford University. He currently sits on the board of Inuk Networks, which operates an IPTV broadcast platform. Azeem lives in London with his wife and son.

Todd Parker – Since 2002, Mr. Parker has been a Managing Director at Hidden River, LLC, a firm specializing in Mergers and Acquisitions consulting services to the wireless and communications industry. Previously, from 2000 to 2002, Mr. Parker was the founder and CEO of HR One, a human resources solutions provider and software company. Mr. Parker has also held senior executive and general manager positions with AirTouch Corporation, where he managed over 15 corporate transactions and joint venture formations with a total value of over $6 billion. Prior to AirTouch, Mr. Parker worked for Arthur D. Little as a consultant. Mr. Parker earned a BS from Babson College in Entrepreneurial Studies and Communications.

Powerset.com is the 2nd Semantic App being featured by Web2Innovations in its series of planned publications, in which we will try to discover, highlight and feature the next generation of web-based semantic applications, engines, platforms, mash-ups, machines, products, services, mixtures, parsers, approaches and far beyond.

The purpose of these publications is to discover and showcase today’s Semantic Web Apps and projects. We’re not going to rank them, because there is no way to rank these apps at this time – many are still in alpha and private beta.

Via

[ http://www.powerset.com ]
[ http://www.powerset.com/about ]
[ http://en.wikipedia.org/wiki/Power_set ]
[ http://en.wikipedia.org/wiki/Powerset ]
[ http://blog.powerset.com/ ]
[ http://lucene.apache.org/hadoop/index.html ]
[ http://wiki.apache.org/lucene-hadoop/Hbase ]
[ http://blog.powerset.com/2007/10/16/powerset-empowered-by-hadoop ]
[ http://www.techcrunch.com/2007/09/04/cuill-super-stealth-search-engine-google-has-definitely-noticed/ ]
[ http://www.barneypell.com/ ]
[ http://valleywag.com/tech/rumormonger/hanky+panky-ousts-pell-as-powerset-ceo-318396.php ]
[ http://www.crunchbase.com/company/powerset ]

Freebase: open, shared database of the world’s knowledge

Freebase, created by Metaweb Technologies, is an open database of the world’s information. It’s built by the community and for the community – free for anyone to query, contribute to, build applications on top of, or integrate into their websites.

Already, Freebase covers millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC archives, it contains structured information on many popular topics, including movies, music, people and locations – all reconciled and freely available via an open API. This information is supplemented by the efforts of a passionate global community of users who are working together to add structured information on everything from philosophy to European railway stations to the chemical properties of common food ingredients.

By structuring the world’s data in this manner, the Freebase community is creating a global resource that will one day allow people and machines everywhere to access information far more easily and quickly than they can today.

Freebase aims to “open up the silos of data and the connections between them”, according to founder Danny Hillis at the Web 2.0 Summit. Freebase is a database that has all kinds of data in it and an API. Because it’s an open database, anyone can enter new data in Freebase. An example page in the Freebase db looks pretty similar to a Wikipedia page. When you enter new data, the app can make suggestions about content. The topics in Freebase are organized by type, and you can connect pages with links and semantic tagging. So in summary, Freebase is all about shared data and what you can do with it.

The Company behind

Metaweb Technologies, Inc. is a company based in San Francisco that is developing Metaweb, a semantic data storage infrastructure for the web, and its first application built on that platform named Freebase, described as an “open, shared database of the world’s knowledge”. The company was founded by Danny Hillis and others as a spinoff of Applied Minds in July, 2005, and operated in stealth mode until 2007.

Reach

According to Quantcast, which we believe is very accurate, Freebase is not a popular site despite the press attention it gets, reaching fewer than 5,000 unique visitors per month. Compete reports slightly more than 8,000 uniques per month.

The People

William Daniel “Danny” Hillis (born September 25, 1956, in Baltimore, Maryland) is an American inventor, entrepreneur, and author. He co-founded Thinking Machines Corporation, a company that developed the Connection Machine, a parallel supercomputer designed by Hillis at MIT. He is also co-founder of the Long Now Foundation, Applied Minds, Metaweb Technologies, and author of The Pattern on the Stone: The Simple Ideas That Make Computers Work.

Investors

In March 2006, Freebase received $15 million in funding from investors including Benchmark Capital, Millennium Technology Ventures and Omidyar Network.

Freebase is in alpha.

Freebase.com is the first Semantic App being featured by Web2Innovations in its series of planned publications, in which we will try to discover, highlight and feature the next generation of web-based semantic applications, engines, platforms, mash-ups, machines, products, services, mixtures, parsers, approaches and far beyond.

The purpose of these publications is to discover and showcase today’s Semantic Web Apps and projects. We’re not going to rank them, because there is no way to rank these apps at this time – many are still in alpha and private beta.

[ http://freebase.com ]
[ http://roblog.freebase.com ]
[ http://www.crunchbase.com/company/freebase ]
[ http://www.readwriteweb.com/archives/10_semantic_apps_to_watch.php ]
[ http://en.wikipedia.org/wiki/Danny_Hillis ]
[ http://www.metaweb.com ]
[ http://en.wikipedia.org/wiki/Metaweb_Technologies ]