
Category: Algorithms

There are 10 posts published under Algorithms.

4 Web Sites with Technical Information for Business Executives

My researchers keep track of useful Web sites. As Bing and Google shift from “search” to other information interests, having a library card catalog of Web information resources becomes more important. I want to highlight four “search” services that may be useful to a person conducting business intelligence or basic research.

 

Infomine http://infomine.ucr.edu/

 

This site provides basic and advanced search for “scholarly Internet collections.” The free service provides a hotlink to subject categories. A partial listing of the categories includes:

 

  • Biology, agriculture, and medical information
  • Business and economics information
  • Electronic journals
  • Physical science, engineering, computer science and mathematics.

 

The service is provided by the Regents of the University of California. The public-facing system was developed by information professionals at the University of California, Riverside.

 

A query for “online video” returns hits for the Pew Internet and American Life Project. The same query run via the advanced search option provides an interface with a number of helpful options:

 

[Screenshot: Infomine advanced search interface]

Link: http://infomine.ucr.edu/cgi-bin/search

 

The results were narrowed from 1,600 hits to 341. This resource is most useful for academic queries, but I have also found it helpful for general queries about specific technical issues such as “thorium reactor”.

 

The key differentiator for Infomine is that a subject matter expert selected the sites in the system’s index. I am not a fan of uncontrolled comments; Infomine, however, allows a researcher to add information to a specific result.

 

Directory of Open Access Journals http://www.doaj.org/

 

Editorial controls on academic journals vary significantly from title to title and publisher to publisher. I have encountered publishers who knowingly leave articles containing bogus information in their commercial indexes. I have brushed against publishers who look “academic” but are really fronts for conference programs. If you don’t have access to commercial databases or don’t know which database to use when searching your local library’s online services, navigate to DOAJ.

 

The system indexes about 10,000 open access journals. Coverage varies because some of the listed journals do not make every article available without a fee; you can view the full text of about 5,600 journals without paying. The service offers an advanced search function. The interface for my query “Inconel” looks like this:

 

[Screenshot: DOAJ advanced search results for the query “Inconel”]

Link: http://www.doaj.org/search?source={%22query%22:{%22query_string%22:{%22query%22:%22Inconel%22,%22default_operator%22:%22AND%22}}}

 

The articles were on point. The top hit referenced “phase transformation”, one of the key characteristics of this specialized alloy. When looking for technical information, I encourage my researchers to run the query using synonyms across multiple systems. A good rule of thumb is that competitive services have an overlap of 70 to 80 percent. In many cases, the overlap is much lower. Running the same query across multiple indexes usually provides additional insight into useful sources of information.
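Since the overlap claim is easy to check empirically, here is a minimal sketch of how one might measure it: run the same query against two services, collect the result URLs, and compute the percentage of one result set that appears in the other. The result lists shown are placeholders, not real search output.

```python
# Measure how much two search services' result sets overlap for the same query.
# The result lists below are placeholders; in practice they would be collected
# from each service's results pages or APIs.
def overlap_percent(results_a: list[str], results_b: list[str]) -> float:
    """Percentage of service A's results that also appear in service B's results."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a:
        return 0.0
    return 100.0 * len(set_a & set_b) / len(set_a)

if __name__ == "__main__":
    service_a = ["example.org/thorium-1", "example.org/thorium-2", "example.edu/reactor"]
    service_b = ["example.org/thorium-2", "example.edu/reactor", "example.gov/report"]
    # 2 of service A's 3 hits also appear in service B: roughly 67% overlap.
    print(f"{overlap_percent(service_a, service_b):.0f}% overlap")
```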

 

NSDL (National Science Digital Library) http://nsdl.org/search/results/

 

NSDL is funded by the National Science Foundation. The system “provides access to high quality online educational resources” for teaching and learning. The system points to resources. For a person investigating or researching a topic, NSDL can pinpoint people, institutions, and content germane to a scientific query. I ran a query for “phase diagram.” The system returned educational materials as the top ranked hit. However, the fourth hit pointed to a technical paper, provided a hot link to the author, and permitted a free download of the PDF of the research report.

 

If you are looking for other US government technical information, be sure to include Science.gov. Unlike USA.gov, Science.gov taps the Deep Web Technologies federated search engine. Another must-use site is the www.nist.gov resource. Searching US government Web sites directly often returns results not in the USA.gov search index.

 

[Screenshot: NSDL search results for “phase diagram”]

The NSDL system displayed content from the Material Digital Library Pathway. Link: http://matdl.org/repository/view/matdl:211

 

Science.gov http://www.science.gov/

 

Science.gov searches over 60 databases and over 2200 selected Web sites from 15 Federal agencies. The system provides access to information on about 200 million pages of content.

 

Science.gov 5.0 provides the ultimate science search through a variety of features and abilities, including:

 

  • Accessing over 55 databases and 200 million pages of science information via one query
  • Clustering of results by subtopics, authors, or dates to help you target your search
  • Wikipedia results related to your search terms
  • Eureka News results related to your search terms
  • Mark & send option for emailing results to friends and colleagues
  • Download capabilities in RIS
  • Enhanced information related to your real-time search
  • Aggregated Science News Feed, also available on Twitter
  • Updated Alerts service
  • Image Search

 

My test query was “thorium reactor.” The information was on point. The screenshot of the basic query’s search results appears below:

 

[Screenshot: Science.gov search results for “thorium reactor”]

There are quite a few bells and whistles available. I typically focus on the text results, ignoring the related topics and sidebar “snippets.” An advanced search function is not available. There is, however, a “refine search” function that allows the initial result set to be narrowed. The system displays the first results that come back from the federated search engine. You can view only these or instruct the system to integrate the full set of hits into the results list. I strongly recommend getting the full results list even though it adds latency.

 

Unlike USA.gov, Science.gov has focused on search. USA.gov is an old-style portal, almost like a Yahoo for the US government’s publicly accessible information. Science.gov is more useful to me, but you may find USA.gov helpful in your research.

 

I want to highlight other search resources in 2014. I find Google useful, but it is becoming more and more necessary to run queries across different systems. You can find more online research tips and tricks at DeeperQI.com, edited by librarians for business researchers and competitive intelligence professionals.


Easily Get Your Open Source Framework Onto OpenStack

The number of open source frameworks available today is growing at an enormous pace, with over 1 million unique open source projects, as indicated in a recent survey by Black Duck.

 


 

The availability of these frameworks has changed the way we build products and structure our IT infrastructure. Products are often built through integration of different open source frameworks rather than developing the entire stack.

 

The speed of innovation is also an area that has changed as a result of this open source movement. Where, previously, innovation was mostly driven by the speed of development of new features, in the open source world, innovation is greatly determined by how fast we can integrate and take advantage of new open source frameworks.

 

Having said that, the process of trying out new open source frameworks can still be fairly tedious, especially if there isn’t a well-funded company behind the project. Let me explain why.

 

Open Source Framework Integration Continuum

 

The integration of a new open source framework often goes through an evolution cycle that I refer to as the integration continuum.

 

[Diagram: the open source framework integration continuum]

 

A typical integration process often involves initial exploration, where we continuously explore new frameworks, sometimes without any specific need in mind. At this stage, we want to get a feel for the framework, but don’t necessarily have the time to take a deep dive into the technology or deal with complex configurations. Once we find a framework that we like and see a potential use for it, we start to look closer and run a full POC.

 

In this case, we already have a better idea of what the product does and how it can fit into our system, but we want to validate our assumptions. Once we are done with the POC and have found what we were looking for, we start the integration and development stage with our product.

 

This stage is where we are often interested in getting our hands on the API and the product features, and need a simple environment that will allow us to quickly test the integration of that API with our product. As we get closer to production, we get more interested in the operational aspects and need to deal with building the cluster for high availability, adding more monitoring capabilities and integrating it with our existing operational environment.

 

Each of those steps often involves downloading the framework, setting up the environment, and tuning it to fit the needs of the specific stage (i.e., trial, POC, dev, production). For obvious reasons, the requirements for each of those stages are fairly different, so we typically end up with a different setup process and different tools in each step, leading to a process in which we are continuously ripping out and replacing the setup from the previous step as we move from one stage to the next.

 

The Friction Multiplier Effect

 

The friction described above applies to the process for a single framework. In reality, however, during the exploration and POC stages we evaluate more than one framework in order to choose the one that best fits our needs.

 

In addition, we often integrate with more than one product or framework at a time. So in reality, the overhead that I was referring to is multiplied by the number of frameworks that we are evaluating and using. The overhead in the initial stages of exploration and POC often ends up as complete waste because we typically choose one framework out of several, meaning we spend a lot of time setting up frameworks that we are never going to use.

 

What about Amazon, GAE or Azure Offerings?

 

These clouds limit their services to a particular set of pre-canned frameworks, and the infrastructure and tools that most clouds offer are not open source. This means that you will not be able to use the same services in your own environment. It also introduces a higher degree of lock-in.

 

Flexibility is critical as the industry moves toward more advanced development and production scenarios, and this could become a huge roadblock. If you get stuck with the out-of-the-box offering, your only way out is to rebuild the entire environment yourself from scratch, using a different set of tools.

 

A New Proposition

 

Looking at existing alternatives, there is a need for a hassle-free preview model that lets a user seamlessly take a demo into production while remaining fully customizable and adaptable. With GigaSpaces’ Cloudify Application Catalog, we believe that the game is changing. This new service, built on HP’s OpenStack-based Public Cloud, makes the experience of deploying new open source services and applications on OpenStack simpler than on other clouds. By taking an open source approach, it avoids the dead ends, complete re-writes, and lock-in that can appear as you advance through the process. At the same time, it provides a hassle-free, one-click experience through an “as-a-service” offering for deploying any open source framework of choice. Because the same underlying infrastructure and set of tools are used through all the stages, users can carry their experience and investment from one stage to the next and avoid a complete re-write.

 

Final Words

 

In today’s world, innovation is key to allowing organizations to keep up with the competition.

 

The use of open source has enabled everyone to accelerate the speed of innovation. However, the process of exploring and trying open source frameworks is still fairly tedious, with much friction in the transition from initial exploration to full production. With the new Catalog service, many users and organizations can increase their speed of innovation dramatically by simplifying the process of exploring and integrating new open source frameworks.


CAPTCHA is Dead, Long Live…

Even before news broke that the startup Vicarious had found a way to crack CAPTCHA at least 90% of the time, CAPTCHA was broken. Not only that, it is a tedious and increasingly frustrating task. A UCSD study calculates that the average user spends 14 seconds on a CAPTCHA, with an error rate of 10%. We were overdue for a new, better way of combating spam long before Vicarious revealed its hack.

 

CAPTCHA, which stands for Completely Automated Public Turing test to tell Computers and Humans Apart, is actually a reverse Turing test whose aim is to differentiate machines from humans by finding ways in which a machine is incapable of thinking like a human. A Turing test, named after the famed codebreaker, mathematician, and early computer scientist Alan Turing, originally described a test of machine intelligence. For Turing, a machine passed the test and could be considered a thinking entity when it could convince a human that it was human. CAPTCHA’s original premise was that computers could read text, but not if the text was in an image.

 

However, as technology improved and advancements like OCR text recognition technology allowed computers to read text from images, CAPTCHAs had to become more and more complicated to foil machines. From wavy lines and letters to crosshatched or noise-filled images, CAPTCHA had to come up with more ways to be illegible to machines while still remaining accessible to humans.

 

Ironically, one of the most popular CAPTCHA services, the Google-owned reCAPTCHA, actually uses user-submitted responses to read some of the copious numbers of books that Google has digitized. In other words, this CAPTCHA service may be helping text-in-image recognition even as it uses text-in-image recognition to confound machines.

 

The other problem, of course, is that as computers get better at reading text in images, CAPTCHA images become increasingly inscrutable and unintelligible. It has come to the point where making text illegible to machines also makes it pretty illegible to humans, or at least illegible enough that humans complain.

 

Finally, since the goal of CAPTCHA is to differentiate human from machine, CAPTCHA never managed to completely thwart human spammers. At best, these measures only slowed them down.

 

Images are a bust, and CAPTCHA never really addressed human spammers. We must use the many other ways humans differ from machines to keep the spammers at bay.

 

Form-Fillers of the World, Divide

 

While Turing tests will never weed out all spammers, especially human spammers, autofilters are a valuable first defense against all that computer-generated spam. One way humans and spambots definitely differ is in how they fill out forms. Even humans who routinely use the auto-fill functions in their browsers must manually fill in at least one portion of a form. Systems have already been developed that can analyze how a user is filling out a form (how quickly, in what order, etc.). The reverse Turing test could then use behavior itself as a metric to discern machine from human.
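To make the idea concrete, here is a minimal sketch, purely my own illustration rather than a description of any shipping product, of how form-fill telemetry (total fill time, field order, keystroke count) might be scored to flag likely bots. The field names and thresholds are assumptions chosen for the example.

```python
# Hypothetical heuristic: score a form submission as bot-like based on
# how it was filled out (timing and field order), not what it contains.
from dataclasses import dataclass

@dataclass
class FormTelemetry:
    fill_time_seconds: float      # time from page load to submit
    fields_in_visual_order: bool  # did the user move through fields top-to-bottom?
    keystroke_events: int         # raw keystrokes recorded by the page

def bot_likelihood(t: FormTelemetry) -> float:
    """Return a 0.0-1.0 score; higher means more bot-like. Thresholds are illustrative."""
    score = 0.0
    if t.fill_time_seconds < 2.0:        # humans rarely complete a form this fast
        score += 0.5
    if not t.fields_in_visual_order:     # scripts often populate fields out of order
        score += 0.3
    if t.keystroke_events == 0:          # values injected without any typing
        score += 0.2
    return min(score, 1.0)

if __name__ == "__main__":
    suspicious = FormTelemetry(fill_time_seconds=0.4,
                               fields_in_visual_order=False,
                               keystroke_events=0)
    print(bot_likelihood(suspicious))  # 1.0 -> route to extra verification
```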

 

Talk Spam to Me, Baby

 

The language of spam has its own particular grammar. Instead of using CAPTCHA (whose primary purpose is to prevent machine-generated spam), the content of messages could be analyzed to identify and block spam. Like email spam filters, these filters could flag or block messages with particular keywords, suspicious email addresses, or suspicious names.
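As a rough illustration of this kind of content analysis, here is a minimal keyword-scoring sketch in the spirit of an email spam filter. The keyword list, weights, sender patterns, and threshold are all invented for the example.

```python
# Minimal content-based filter: flag messages whose text or sender matches
# known spam signals. The keyword list and weights are illustrative only.
import re

SPAM_KEYWORDS = {"free money": 3, "click here": 2, "limited offer": 2}
SUSPICIOUS_SENDER = re.compile(r"\d{6,}@|@.*\.(ru|cn)$", re.IGNORECASE)  # example patterns

def spam_score(message: str, sender: str) -> int:
    text = message.lower()
    score = sum(weight for kw, weight in SPAM_KEYWORDS.items() if kw in text)
    if SUSPICIOUS_SENDER.search(sender):
        score += 2
    return score

def is_spam(message: str, sender: str, threshold: int = 3) -> bool:
    return spam_score(message, sender) >= threshold

if __name__ == "__main__":
    print(is_spam("Click here for free money!!!", "promo123456@example.ru"))   # True
    print(is_spam("Can we move our meeting to 3pm?", "colleague@example.com"))  # False
```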

 

Frustration-Free Forms

 

The advantage of these methods is that, unlike CAPTCHA, they only require the user not to be posting spam. The ideal spam filter would not add another step to forms that are sometimes already frustrating and tedious. Losing a user because the spammers got too smart for CAPTCHA is bad for you and your users.


How Relational Databases Could Ruin Christmas

As you know, a bad user experience on Cyber Monday can cause users to abandon an eCommerce site faster than during non-seasonal online shopping, resulting in massive revenue loss. The revenue lost during just a few hours of downtime on Cyber Monday can reduce a company’s yearly revenue by 10-15%.

 

With coupons scattered across the internet and increased overall demand on Cyber Monday, eCommerce websites have to automatically adapt and expand to handle fluctuating customer behavior. Usually there is a limited amount of product available at a special price during this sales period, and the first customers to reach the website get the best deals.

 

Because most relational database systems keep data in tables, a lock is placed on the inventory table row that is in the process of being updated. With typical average eCommerce website traffic, this isn’t an issue – but with too many requests at the same time, retail systems can face slow-downs or even an inability to manage the requests due to this lock behavior. On days like Cyber Monday, sudden, intensive loads create a situation where retailers’ traditional central database systems are burdened by a large number of simultaneous lock requests for the same inventory table row (issued in order to update the data with user activities). This burden causes a delay in the backend database response time. In some cases, such a load can bring down the database system and shut down the entire eCommerce system, preventing users from performing any transactions for several hours!
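As an illustration of the contention being described (not the behavior of any particular database engine), the following sketch simulates many concurrent checkouts for one hot product all queuing on a single lock that stands in for the row-level lock on the inventory table.

```python
# Simulation of the row-lock bottleneck described above: every checkout for the
# same hot product must acquire the same lock, so requests are serialized.
# This is an illustration in Python, not the behavior of any specific database.
import threading
import time

inventory = {"hot_deal_sku": 1000}
row_lock = threading.Lock()          # stands in for the row-level lock on the inventory table

def checkout(sku: str) -> bool:
    with row_lock:                   # all concurrent buyers of this SKU queue here
        time.sleep(0.001)            # pretend the locked update takes ~1 ms
        if inventory[sku] > 0:
            inventory[sku] -= 1
            return True
        return False

if __name__ == "__main__":
    start = time.time()
    threads = [threading.Thread(target=checkout, args=("hot_deal_sku",)) for _ in range(500)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # 500 requests times ~1 ms of serialized work: elapsed time grows linearly with
    # load, which is exactly the Cyber Monday slowdown the text describes.
    print(f"remaining={inventory['hot_deal_sku']}, elapsed={time.time() - start:.2f}s")
```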

 

A redesign of the inventory management, offers, reservations, fulfillment, personalization, pricing, and promotional components of eCommerce systems is needed to ensure website survival on Cyber Monday. The system of record must be elastic and blazing fast, cannot rely on a database or any disk-based storage medium, must be highly available across multiple data centers or cloud environments, and must provide enterprise-level security.

 

The GigaSpaces XAP in-memory computing product has been the system of record for many leading eCommerce systems around the world, delivering low-latency response times even under massive peak load. The legacy enterprise database is not part of the transaction path, which ensures the ability to cope with millions of transactions per minute on inexpensive commodity hardware.

 

GigaSpaces XAP integrates with leading eCommerce platforms such as Oracle ATG and the IBM Portal/eCommerce platforms, allowing users to leverage their investment in their existing eCommerce platform. XAP can be easily integrated with these systems as a reliable and fast system of record that delivers peace of mind to IT and retail executives during the peak load of Cyber Monday. This integration immediately delivers ROI and shortens time to market when the eCommerce implementation is enhanced to support new products, offers, or promotions.

 

A Fortune 200 retailer’s eCommerce system achieves 70% revenue growth with GigaSpaces XAP

 

Challenge

 

For several years, this top retailer’s eCommerce website could not handle the exponential increase in traffic, which slowed the site’s response time dramatically at some points and shut it down altogether at others. Social marketing promotions and online coupons increased the number of simultaneous user sessions, and the resulting catalog access caused resource contention and a total shutdown of the eCommerce system. This resulted in a multi-million dollar loss of revenue during Cyber Monday.

 

Solution & results

 

After analyzing the problem, this top retailer integrated the GigaSpaces XAP In-Memory Computing solution with its eCommerce system to manage all catalog, offer, promotion, and pricing data using a distributed, transactional data grid. This simple enhancement to the existing eCommerce system improved overall performance and eliminated the database as a bottleneck for eCommerce transactions. The new architecture continues to support 30% annual growth in system capacity and delivered 70% revenue growth for Cyber Monday. The eCommerce system now has significantly higher performance, with Web performance increasing by 500%. The system is also now able to streamline and scale the available-inventory process, dynamic pricing, promotions processing, and business analytics. All of this was achieved with less hardware infrastructure; the hardware required for Cyber Monday was reduced by 50%.

 

Businesses need to think about traffic timing differently on one of the busiest days of the year. This is just one solution to a behind-the-scenes problem that should be addressed.


The Software Defined Hype Cycle

Wherever you look, there are clouds. Across every horizon, in every datacenter, a cloud of a public or private nature is found. The use of Cloud as a marketing term has peaked.  As the Cloud hype cycle fades, a new branding and marketing term has arisen: Software Defined Networking or SDN.

 

Rooted in the academic pursuit of Network Virtualization, SDN is programmatic control over networks. With SDN, a network admin can do more with less. In a world that pulses to the tune of light on glass, this is a powerful thing. The acquisition of Nicira by VMware (for north of $1B, at a rumored >100x revenues), along with other deals, is fanning the flames. The big companies want in on that money.

 

The war for the future of information technology is being fought in the datacenter. SDN may be the latest frontier, but is it all hype? Many products, of frankly questionable technical merit, brand themselves as SDN. Any good marketing term in information technology is subject to similar abuse.

 

Because information technology is so hard for most people to understand, companies naturally gravitate towards anything that resonates with the prospect of increased utility and ease of use (think cloud). Rather than determine whether or not a particular company is real, we’ll derive more benefit from consideration of our current position in the SDN hype cycle. Companies can call their technology whatever they want. The proof is in the RFP process and customer retention, but let’s not digress too far. Today we’ll consider the SDN Hype Cycle!

 

The beginning of the Hype

 

OpenFlow, which started at Stanford, was an attempt to introduce abstraction to network management. Abstraction, from the kernel to the operating system and even to the GUIs we use every day, is central to computing. The theoretical innovation of OpenFlow was to allow abstraction across routers and switches irrespective of manufacturer. Although Google deployed OpenFlow in a rather large network, its use of homogeneous equipment prevented validation of the general concept. OpenFlow was trucking along at Stanford when Diane Greene (then CEO of VMware) suddenly funded one of the primary OpenFlow researchers. That startup was the previously mentioned Nicira, just a couple of years before their acquisition by VMware. The private nature of Nicira prevents detailed knowledge of their technology. It is worth noting that some very large companies run it in their data centers to control their networks, ostensibly using software.

 

Nicira was the beginning of the SDN hype cycle.

 

What is SDN?

 

To my mind, SDN has 3 elements:

 

  • Decoupling networks from services
  • Providing programmatic control of hardware via software
  • Orchestrated management of devices from a GUI (and command line)

 

Decoupling networks from services allows product managers to think about products instead of networks. This is similar to the way Amazon decoupled hardware from applications using AWS.

 

Providing programmatic control is an extension of decoupling. Decoupling by itself is inadequate; for business at scale, manual control is insufficient. In order to provide controls that scale and remain accessible, an API or other control abstraction must be developed.
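As a simple illustration of what programmatic control can look like, here is a hedged sketch of pushing a forwarding rule to an SDN controller over a REST API. The controller address, endpoint path, and payload schema are hypothetical and do not describe any particular vendor’s API.

```python
# Sketch of "programmatic control": pushing a forwarding rule to an SDN
# controller over a REST API instead of logging into each switch by hand.
# The endpoint URL and payload schema are hypothetical, not a real controller's API.
import json
import urllib.request

CONTROLLER = "http://sdn-controller.example.local:8080"   # hypothetical controller address

def add_flow_rule(switch_id: str, match_ip: str, out_port: int) -> int:
    rule = {
        "switch": switch_id,
        "match": {"ipv4_dst": match_ip},
        "action": {"output": out_port},
        "priority": 100,
    }
    req = urllib.request.Request(
        f"{CONTROLLER}/api/v1/flows",                      # hypothetical endpoint
        data=json.dumps(rule).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status                                 # e.g. 201 when the rule is installed

if __name__ == "__main__":
    # One call can be repeated across hundreds of switches: "one controlling many."
    print(add_flow_rule("edge-switch-17", "10.0.42.0", out_port=3))
```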

 

Orchestrated management is about making each individual contributor more powerful. Whereas previously many man-hours were required for even the most trivial changes, orchestration makes every admin superman. Instead of having to remote into each router, one command, or one click in a GUI, results in transformative change across the infrastructure: one controlling many.

 

The punch line is this: SDN is about concealing complexity. SDN hides all of the difficult parts of network management behind a facade of simple GUI goodness.

 

That sounds awesome! What could possibly be wrong with that?

 

Well, as is sometimes the case in technology, the truth and the hype may not necessarily line up. There are a lot of companies that pitch solutions as SDN when in reality they’re nothing of the sort. To help you separate the wheat from the chaff, here are three helpful guidelines for deciding if something is an SDN contender or pretender:

 

  • Is it hardware agnostic?

 

This is a pretty simple test. SDN is, among other goals, intended to break vendor lock-in. A company pitching vendor lock-in as a part of SDN is clearly pretending.

 

  • Does it have an API or other control mechanism that’s faster than clicking on things?

 

This test is simple: if you have to use a GUI to control everything it’s not really going to allow for the advanced programmatic control you’ll need as your business grows.

 

  • Does it allow product people to forget about the underlying infrastructure?

 

This one is a little tricky. If perusing the product specs reveals many network-focused items instead of deliverables (itemized server lists versus “I need X compute resources here”), chances are your product managers are wrestling with the network when they should be building products.

 

Again, ending vendor lock-in, accessibility and concealing complexity are the core tenets of SDN-goodness. If you’re missing one or more, chances are you’re not dealing with SDN.

 

Ok so now that we understand SDN, is it relevant?

 

If your audience is part of the Fortune 1000, chances are their IT department has a directive to pursue the cloud. At least half of the businesses I’ve worked with personally are looking at SDN. Although fewer have formal directives to pursue SDN, there’s still a groundswell of activity around it reminiscent of Cloud in 2010.

 

We now seem to be at a point of cultural acceptance for SDN. The market will likely press on in this direction for the next three to five years. During that time, marketing teams should position their technology to take advantage of these dynamics. Software Defined Networking is just a better way of managing networks. We’ll all benefit tremendously when true SDN is widely-implemented.

 

So is SDN all hype? Not from where I’m sitting. Bet on SDN hitting many more data centers in 2014.


Facebook’s vs Twitter’s Approach to Real-Time Analytics

Last year, Twitter and Facebook released new versions of their real-time analytics systems.

 

In both cases, the motivation was relatively similar — they wanted to provide their customers with better insights on the performance and effectiveness of their marketing activities. Facebook’s measurement includes “likes” and “comments” to monitor interactions. For Twitter, the measurement is based on the effectiveness of a given tweet – typically called “Reach” – basically a measure of the number of followers that were exposed to the tweet. Beyond the initial exposure, you often want to measure the number of clicks on that tweet, which indicate the number of users who saw the tweet and also looked into its content.

 

Facebook’s vs Twitter’s Approach to Real-Time Analytics

 

Facebook Real-Time Analytics Architecture – Logging-Centric Approach:

 

 

  • Relies on the Apache Hadoop framework for real-time and batch (map/reduce) processing. Using the same underlying system simplifies its maintenance.

 

  • Limited real-time processing — the logging-centric approach basically delegates most of the heavy lifting to the backend system. Performing even a fairly simple correlation beyond simple counting isn’t a trivial task.

 

  • Real-time is often measured in tens of seconds. In many analytics systems, this order of magnitude is more than enough to express a real-time view of what is going on in your system.

 

  • It is suitable for simple processing. Because of the logging nature of the Facebook architecture, most of the heavy lifting of processing cannot be done in real-time and is often pushed into the backend system.

 

  • Low parallelization — Hadoop systems do not give you ways to ensure ordering and consistency based on the data. Because of that, Facebook came up with its Puma service, which collects and inputs data into a centralized service, thus making it easier to process events in order.

 

  • Facebook collects user click streams from the Facebook wall through an Ajax listener, which then sends those events back to the Facebook data centers. The information is stored in the Hadoop File System via Scribe and collected by PTail.

 

  • Puma aggregates logs in-memory, batching them in windows of 1.5 seconds, and stores the information in HBase (a minimal sketch of this windowed aggregation appears after this list).

 

  • The Facebook approach places a hard limit on the volume of events that the system can handle and has significant implications for the utilization of the overall system.
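For illustration only, here is a minimal sketch of the kind of windowed, logging-centric aggregation described in the Puma bullet above. It is not Facebook’s code; the window length and event shape are assumptions taken from the description.

```python
# Minimal sketch of logging-centric, windowed counting in the spirit of the
# Puma description above: buffer incoming log events in memory, then flush
# aggregated counters every 1.5 seconds. This is an illustration, not Puma's code.
import time
from collections import Counter

WINDOW_SECONDS = 1.5

def windowed_counts(event_stream, flush):
    """Consume (timestamp, page_id) events and flush per-page counts once per window."""
    window_start = time.monotonic()
    counts = Counter()
    for _, page_id in event_stream:
        counts[page_id] += 1
        if time.monotonic() - window_start >= WINDOW_SECONDS:
            flush(dict(counts))          # in production this write would go to a store such as HBase
            counts.clear()
            window_start = time.monotonic()
    if counts:
        flush(dict(counts))              # flush whatever remains at the end of the stream

if __name__ == "__main__":
    def fake_events(n=10000):
        for i in range(n):
            yield time.monotonic(), f"page_{i % 3}"
    windowed_counts(fake_events(), flush=print)
```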

 

Twitter Real-Time Analytics Architecture – Event-Driven Approach:

 

 

  • Unlike Facebook, Twitter uses Hadoop for batch processing and Storm for real-time processing. Storm was designed to perform fairly complex aggregation of the data that comes through the stream as it flows into the system, before it is sent back to the batch system for further analysis.

 

  • Real-time can be measured in milliseconds. While having second- or millisecond-level latency is not crucial to the end user, it does have a significant effect on the overall processing time and on the level of analysis that we can produce and push through the system, as many of those analyses involve thousands of operations to get to the actual result.

 

  • It is suitable for complex processing. With Storm, it is possible to perform a wide range of complex aggregations while the data flows through the system, which has a significant impact on the complexity of processing the system can support. A good example is calculating trending words. With the event-driven approach, we can assume that we have the current state and simply apply each change to that state to update the list of trending words. In contrast, a batch system has to read the entire set of words, re-calculate, and re-order them on every update, which is why those operations are often done in long batches (see the sketch after this list).

 

  • Extremely parallel - asynchronous events are, by definition, easier to parallelize, and Storm was designed for extreme parallelization. Ultimately, this determines the speed and level of utilization that we can get out of each machine in our system. Looking at the bigger picture, that speed and utilization have a substantial effect on the cost of our system and on our ability to perform complex analyses.
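To make the trending-words contrast concrete, here is a minimal single-machine sketch of the two approaches. It is my own illustration, not Storm or Hadoop code, and real systems distribute this work across many machines.

```python
# Contrast described in the trending-words bullet: an event-driven system keeps
# running counts and updates the top-N incrementally per event, while a batch
# system re-counts and re-sorts the whole corpus on every refresh.
import heapq
from collections import Counter

class TrendingWords:
    def __init__(self, top_n: int = 10):
        self.counts = Counter()
        self.top_n = top_n

    def on_word(self, word: str):
        """Event-driven path: O(1) count update per incoming word."""
        self.counts[word] += 1

    def trending(self):
        """Cheap top-N over the current state, no full recomputation of history."""
        return heapq.nlargest(self.top_n, self.counts.items(), key=lambda kv: kv[1])

def batch_trending(all_words, top_n: int = 10):
    """Batch path: must re-count and re-sort the entire word set every time."""
    return Counter(all_words).most_common(top_n)

if __name__ == "__main__":
    stream = ["storm", "hadoop", "storm", "puma", "storm", "hadoop"]
    t = TrendingWords(top_n=2)
    for w in stream:
        t.on_word(w)                        # state maintained as events arrive
    print(t.trending())                     # [('storm', 3), ('hadoop', 2)]
    print(batch_trending(stream, top_n=2))  # same answer, recomputed from scratch
```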

Final Words

 

Quite often, we get caught in the technical details of these discussions and lose sight of what this all really means.

 

If all you are looking for is to collect data streams and simply update counters, then both approaches would work. The main difference between the two is felt in the level and complexity of processing that you would like to perform in real-time. If you want to continuously update different kinds of sorted lists or indexes, you’ll find that doing so with an event-driven approach, as in the case of Twitter, can be exponentially faster and more efficient than with the logging-centric approach. To put some numbers behind that, Twitter reported that calculating reach without Storm took 2 hours, whereas Storm could do the same in less than a second.

 

Such a difference in speed and utilization has a direct correlation with the business’s bottom line, as it determines the level and depth of intelligence the business can run against its data. It also determines the cost of running the analytics systems and, in some cases, their availability: when processing is slower, there is a larger number of scenarios that could saturate the system.

 



TechCrunch Disrupt Battlefield Winner Announced!

On the final day of TechCrunch Disrupt, the final six startups duked it out in the Startup Battlefield: Regalii, Ossia, Fates Forever, SoilIQ, Dryft, and Layer. Each of these teams competed earlier in the Startup Battlefield against 23 other finalists and came out on top.

To determine the winner, the startups’ impact on the world is reviewed by a panel of judges consisting of Michael Arrington (founder of TechCrunch), Roelof Botha (partner at Sequoia Capital), Chris Dixon (Andreessen Horowitz), David Lee (founder of SV Angel), Marissa Mayer (CEO of Yahoo!), and Keith Rabois (Khosla Ventures). The startup that wins the Startup Battlefield brings home the Disrupt Cup along with a $50,000 stipend.

 

Regalii’s goal is to maximize the efficiency of remittance. The product functions through SMS; essentially, the sender sends an SMS to somebody, and the recipient gets a PIN that works like a gift card. This is an improvement over current forms of remittance such as Western Union in that the recipient gets the money immediately, meaning that, along with speedy delivery, both parties minimize the deductions made by a third party. Thus far, the startup has experienced growth of 67% per week, showing signs of success in the market. Regalii claimed that this product “is the future of global remittance.”

 

Ossia created a product named COTA. This device is the first of its kind: it can remotely send power to devices anywhere within a 30-foot radius and charge them. COTA detects the device’s location via a secondary attachment and sends power to that location, and only to that location. A demonstration was shown in which a light bulb lit up when held near a prototype but went out the moment it was moved; the consumer model is expected to be able to charge a moving device as well. Imagine a world where wires are no longer needed and everything is charged wirelessly. That is the future that Ossia offers with COTA.

 

Fates Forever is a gaming startup. What makes it special is that it proposes to bring a MOBA (multiplayer online battle arena) to the growing tablet market, something that has never been done before. The touchscreen will add even more to the gaming experience. While some of the judges questioned the novelty (and, thus, the ability to capture a large enough market) of reintroducing a popularized genre on the tablet, founder Jason Citron countered that many successful games are essentially reintroductions of previous games with a different spin. In this case, the use of a touchscreen and an under-saturated tablet gaming market could carry Fates Forever far.

 

SoilIQ aims to revolutionize the farming industry. The company created a device that can be stuck into the ground, where it measures various aspects of the soil: pH, temperature, humidity, sunlight, and so on. The device is solar powered and sends the data to the cloud, where it is analyzed. The software makes crop suggestions based on the state of the soil. Additionally, when the soil’s parameters change in a way that threatens the crop, a message is sent to the user’s phone. The company measured a 15% - 20% increase in crop yield, showing proof of concept. With this, crops can be grown in larger quantities, sold more cheaply, and, most importantly, feed more mouths.

 

Dryft was the runner-up of the TechCrunch Disrupt Battlefield. Its plan is to optimize typing efficiency on the tablet. The product is essentially an on-screen tablet keyboard that takes advantage of the accelerometer and touchscreen to detect typing based on the vibrations made by tapping the screen. As a byproduct of easier typing, the front end (e.g. autocorrect) won’t be as messy, either. As users transition from notebooks to tablets, this product could push the rate of that shift even faster!

 

This leaves the winner of the TechCrunch Disrupt Battlefield, Layer. This startup offers a product for all app developers: code that lets them easily and seamlessly integrate SMS, voice, and video communication into any app. Not only that, there is no phone-to-phone restriction; multiple devices can take part. While many other companies have tried this, they have only succeeded at the beginning stages because they underestimated the difficulty of the undertaking and ran into scaling problems. Layer could potentially maximize the efficiency of communication between all mobile devices.

 

All of these startups had amazing ideas with proof of concept; unfortunately, only one could win. This doesn’t mean the end for any of the other companies. In a backstage chat, one remark was that, “as long as these startups don’t [mess] up, they will be very successful.” While Layer won, not everybody agrees with that decision. Who do you think should have won?


Mac or PC or Linux? It Doesn't Matter

When you look around at all the software that’s being written afresh nowadays, without any legacy whatsoever, you’ll notice something: the product looks, feels, and behaves exactly the same across all your devices and on all operating systems. And most likely, it will be a web-based application.

 

This is remarkable because, merely a decade ago, it took tons of work to make software behave this way. When the web was still largely static and “software” meant “an application program that runs on your desktop,” there were lots of successful software products that would only work on a specific platform or operating system. How did they become successful despite this limitation? Most of the world was running Windows, and the software was written to work with Windows. So they were adopted by the masses and users were happy. Meanwhile, the software makers made a lot of money.

 

Web 2.0 changed all of that. Not overnight, but it started a process of erosion. Erosion takes time, and it has taken time: almost 10 years. But erosion is powerful; it can reduce mountains to soil. Almost no one writes software targeted at the desktop anymore, unless it is for a highly specific problem in a highly specialized domain. Since most start-ups target neither, almost no software start-up makes desktop software. (Even if they do, they will have a web version.) Instead, they design software that runs on any machine, on any OS, and in any browser. We will see this trend rise, and its adoption by big companies will grow because of the way the web has become integrated into our lives.

 

This means that companies that have relied on operating systems for income (read Microsoft) will see a decrease in profits unless they find other means. A bit less obvious, perhaps, is the fact that there is huge potential for a start-up that owns the web application development platform. In other words, make it really easy for people to create software for the web, and you will be the winner.

 

This realization has already dawned on a couple of companies, Apple and Google most prominently, but they have been rather [intentionally?] narrow in their application of the idea. Both have created closed ecosystems for developers: Apple with its App Store and Google with its Play Store. So developers are still making two versions of all their products. And Amazon is quietly trying to get a foot in the door with its Kindle e-book reader.

 

Enormous progress has been made in web development in recent years. JavaScript 6 (ECMAScript 6) is bringing a host of new features, WebSockets are finally coming to the browser, HTML5 has started revamping the way web developers write code, and server-side frameworks are making developers’ lives pain-free.

 

There have been some pivotal moments in the short history of software development (to quote Peter Siebel, “It has lasted for less than one human lifetime”). The introduction of the first computer, the ENIAC, was one of them. Then came punch-card driven computers, the FORTRAN programming language, the high-level language LISP, the microprocessor era, C and C++, Java and Web 1.0 in the 1990s, and Ajax during the last decade. I believe this decade is the one in which web apps will really take off.

 

 


More Messages, More Problems: How Businesses Can Send Fewer, Better Messages

IDEA IN BRIEF

1) Consumers suffer from message overload — and a lot of the messages we get are annoying and irrelevant.

2) This is not just a problem for consumers to deal with — it’s a problem for businesses. The more they annoy customers, the less likely people are to stick around.

3) Consumers keep adding new inboxes: email, SMS, push, social. For every channel, businesses have to add a new piece of message infrastructure.

4) But as businesses layer on new ways of sending messages, the new tools don’t coordinate with the old ones. This leads to crowded inboxes and a poor experience for customers.

5) The next generation of communication companies will do for businesses what designers did for webpages: cut down on the quantity of information, creating whitespace so that messages actually get noticed. This is what Outbound is building.

 

Inboxes are like Tetris

Think about your inbox. What comes to mind? For me, the image is a marathon game of Tetris. I can find better ways to deal with the inflow (install Mailbox, do some filtering, be disciplined about touching every message only once), but it just keeps coming. What would it take to replace Tetris with a different image?

The short answer is that businesses need to start thinking about messages as extensions of their product instead of one-off blasts. The end result would be fewer, higher-quality messages and a lot more whitespace around key information. We need to do the same for the inbox.

In order to understand how this would work, let’s take a step back and look at how businesses send messages now.

 

Why the messages just keep coming

The mold was set with email, which has become the universal channel. We get messages from mom alongside flight details and promos from The Gap. As long as there are 1-to-1 messages from people we know waiting for us, we’re not going to stop checking our inbox, and that makes it a very attractive channel for businesses. As a result, the 1-to-1 messages that draw us to email in the first place make up a smaller and smaller portion of our inbox, and we spend more time sorting and less time consuming information that matters.

But that was only round one. Just as email became ubiquitous and started to reach a saturation point, we all got mobile phones. We started off texting our friends, but it wasn’t long before businesses realized that by collecting our mobile numbers they gained a new channel: inbox number two. We soon upgraded to smartphones with app stores, and we added a third inbox for push notifications.

As each of these channels approaches saturation, it doesn’t go away. New channels just get stacked on the preexisting channels and find new ways to compete for our attention (I’m looking at you, badge app icon!)

Our inboxes didn’t fill up on their own; it took several generations of new infrastructure to enable businesses and nonprofits to message us so steadily and efficiently. Here’s how it got so easy to send messages:

 

Generation 1: Age of the email service provider

It used to be that when you wanted to email a group of people, you had to type a lot of email addresses into the “To” field. Sure, you could set up a listserv, but that was the domain of the webmaster/admin technical guru, not a task for the common man. And even the listserv was a pretty basic tool. But when Constant Contact, MailChimp, and a dozen other email service providers (ESPs) came along, they not only democratized the listserv, but also made it a lot more powerful by adding open and click analytics, scheduled messages, unsubscribe tools and list segmentation.

Over the past fifteen years, at least six ESPs have grown up to be worth more than $100 million by making these features accessible. These companies opened up the playing field by allowing anyone to do email marketing. As ESPs grew, the best ones realized that they needed to protect end-users as well as serve their customers. The leading ESPs enforce strict opt-in and unsubscribe policies and spend a lot of time educating customers about the right way to do email marketing. MailChimp leads the market (folks on Quora estimate that MailChimp may be worth as much as $1 billion) in large part because of its attention to the kind of email experience its customers are creating for their contacts.

But even ESPs that respect our inboxes can’t manage the new flood of messages from the product across email, text and push.

 

Generation 2: Rise of the transactional message

As the market for ESPs matured, transactional message providers like SendGrid began offering tools that let developers outsource some of the email drudgery involved in delivering transactional email. Developers still had to define the business logic to trigger receipts, invoices, app content (like new comments) and other product-based emails, but they no longer had to worry about delivery.

Easier delivery seems like an incremental improvement, but it opened up a flood of transactional email: last year SendGrid reported that “web applications are sending an average of 631,000 emails per month, and nearly 50 percent of user actions in web applications trigger an email alert.”

And that’s just email. Twilio is building an empire around SMS and voice delivery, and Urban Airship is doing the same for push notifications. Both of these companies tout better engagement compared to increasingly lackluster email open rates. But as these channels saturate, their engagement rates will erode the same way that email has.

 

Where does that leave us?

Businesses are sending more messages over more channels than ever before in an effort to capture our attention. As consumers, we battle a constant flow of messages coming at us across channels, constantly sorting and skimming to separate the signal from the rising noise. We’re all stuck playing a frantic game of Tetris with our messages.

 

The next generation of message tools will create… whitespace

How did we move past cluttered websites toward beautiful pages with abundant whitespace and intuitive user experiences? We did it by stepping away from our keyboards long enough to listen to users. What do they think of the page? Where do we lose them?

Message tools need to create whitespace around the messages businesses send in order to help people stop and pay attention. Here are some ways to achieve this:

Begin with the user’s context, not the channel. Most businesses start by deciding which channels they will use to send messages, then gradually fill those with messages to push content out. Instead, start with the context: where and when will this person receive the message? Is the call to action (even if it’s just to absorb information) reasonable given this context? A lot of this involves A/B testing (quick iterations through trial and error), but some of it is common sense.

Pay attention to the relationship. If you’re barraging customers with messages or leaving them in the dark, it doesn’t matter how interesting the content is; those people are probably too annoyed to pay attention. A single, unified view of all the messages going to each user across channels and departments allows you to see when you are overwhelming (or underwhelming) people.

Give users a say. Analyzing user data helps address both of the points above, but you’re never going to get the full picture of a user from analyzing her data. Just as user testing and feedback often yield surprising insights about how to make a webpage better, we need to gather feedback on messages that is much more fine-grained than an on/off unsubscribe button.

 

Where to start?

In small companies, these principles produce slight differences in messaging, but over time those differences multiply as message volume and organizational complexity grow. It will take time to build really good solutions to these problems. My company, Outbound, has begun with three simple innovations:

1) We separate message content from channel so that any information can be sent, using any channel, based on the actions that users take. That leaves your options open depending on the context for your message (a minimal sketch of this separation appears after this list).

2) We log all outbound messages in one place, providing a unified view: the same view your user has of her inboxes.

3) We make automated messages code-free because it’s impossible to iterate quickly based on trial and error if the person who is writing the messages is always waiting for engineering to make edits or change business logic.
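As an illustration of point 1), here is a minimal sketch of separating message content from delivery channel. The channel names, user fields, and send functions are hypothetical stand-ins, not Outbound’s API.

```python
# Minimal sketch of separating message content from delivery channel: the content
# is defined once, and the channel is chosen at send time from the user's context.
# Illustration only; the fields and functions here are hypothetical.
from typing import Callable, Dict

def send_email(address: str, body: str):  print(f"[email to {address}] {body}")
def send_sms(number: str, body: str):     print(f"[sms to {number}] {body}")
def send_push(device: str, body: str):    print(f"[push to {device}] {body}")

CHANNELS: Dict[str, Callable[[str, str], None]] = {
    "email": send_email,
    "sms": send_sms,
    "push": send_push,
}

def deliver(user: dict, content: str):
    """Pick a channel from the user's context; the content itself is channel-agnostic."""
    channel = user.get("preferred_channel", "email")
    CHANNELS[channel](user["addresses"][channel], content)

if __name__ == "__main__":
    user = {"preferred_channel": "push",
            "addresses": {"email": "a@example.com", "sms": "+15550100", "push": "device-42"}}
    deliver(user, "Your export finished: 3 new reports are ready.")
```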

This just scratches the surface of what we need to help businesses create fewer, higher-quality messages. But that’s why we’re so excited about this space. Imagine if, instead of optimizing for more clicks, companies had a tool that optimized to make you totally satisfied and kept changing their message mix until you signaled that you were happy. That’s the inbox I want.


3 Tactics for Freemium Mobile Games and Development

 

One of the greatest benefits the data-driven development paradigm affords a free-to-play mobile studio is the ability for users to communicate their preferences through actions, rather than words: analytics serves as a conduit through which feedback is transmitted between the player base and the developer, allowing games to be improved upon and optimized for user taste, not developer intuition or legacy insight.

 

But game development is an artistic endeavor, and some producers – who are artisans, by any reasonable standard – bristle at the thought of algorithmic, hyper-incremental game design. While game development has become increasingly informed by analytics, very few studios instrument the subjective aspects of design: aesthetic consistency, thematic appeal, game controls, etc. Terms like “juiciness” continue to pervade the industry.

 

As well they should. But just as no free-to-play game should be developed without input from a human, no free-to-play game should be developed in an intuition bubble. Freemium product development is agile, iterative, and data-driven by nature; what follows are three tactics for effectively introducing elements of data-driven development process without sacrificing creative discretion.

 

Define done

 

One of the concepts at the core of agile software development is the “definition of done”: product feature specs must be produced in sufficient detail to allow the team to determine when the development process can be concluded.

 

Defining done for software that exists to serve a discrete, objective purpose – say, translating a page of text from one language to another – is, by virtue of being measurable, easy to do. Defining done for an art asset, or a game design document, or a game economy model, is much more difficult: subjective value varies from person to person, and the state of progress of fundamentally immeasurable artistic undertakings is fluid, not binary (“done” or “not done”).

 

Defining done is important, but the process of defining done is so difficult — and the repercussions of not enforcing it are so dramatic — that many teams avoid ever doing it. The truth is that, no matter how well a team of people works together, at some point decision-by-committee becomes impractical; defining done must sometimes fall on the shoulders of the producer as a unilateral determination.

 

When this is undertaken at the beginning of a well-defined sprint, based on thorough product specs that have been communicated clearly, then conflict is easy to avoid. But when a unilateral decision is made as a sprint is drawing to a close, based on nebulous, imperious intuition, not only is conflict unavoidable but so too is development crunch.

 

Avoid not knowing

 

Intuition can be sticky; human beings are capable of clinging to preconceived notions for far longer than those notions hold true (if they ever did). Part of the liability introduced into free-to-play games development by producers with legacy experience in shipping console and desktop games is that user preferences can be subverted by an intractable and wholly anachronistic worldview.

 

The unknown can only be elucidated by data, and the more recent that data, the better it reflects the realities of a rapidly evolving marketplace. A/B tests are not a panacea, but they are essential in free-to-play game development – not only to shed light on preferences for entirely new mechanics but to reify previously tested design decisions.
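To ground this, here is a minimal sketch of how the result of a simple A/B test on a design change might be evaluated with a two-proportion z-test. The sample sizes and conversion counts are invented for illustration and are not drawn from any real game.

```python
# Minimal sketch of evaluating an A/B test on a design change: compare the
# conversion rates of control and variant with a two-proportion z-test.
# The sample numbers are made up for illustration.
from math import sqrt, erf

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

if __name__ == "__main__":
    # control: 4.0% conversion of 5,000 players; variant: 4.6% of 5,000 players
    z, p = two_proportion_z(conv_a=200, n_a=5000, conv_b=230, n_b=5000)
    print(f"z={z:.2f}, p={p:.3f}")  # decide whether the new mechanic beats intuition
```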

 

Which begs the question: can a producer that isn’t familiar at even the most basic level with common analytics frameworks and data-driven design techniques be effective in a free-to-play environment? Free-to-play games can be improved upon in-market using a panoply of metrics, but doing so requires a skillset that wasn’t previously relevant: a basic understanding of quantitative methods and the technology infrastructure required to test.

 

Promote transparency

 

Iterative development is undertaken because customer needs are never static: they change and evolve over time, and products that don’t adapt as customers do are left for dead in the marketplace.

 

A second core component of agile software development is the stand-up meeting: the entire team congregates in the same room and announces what they accomplished yesterday, what they plan to accomplish today, and whatever obstacles they face in completing their work. Complementing the daily stand-up meeting is the burndown chart, where the tasks allocated to the sprint are graphically arranged in such a way so as to visualize the work remaining.

 

An additional element of transparency that benefits the free-to-play development process is the metrics billboard: a highly visible, centrally located television in the development area that cycles through the game’s relevant metrics post launch. The metrics billboard not only prevents team members from becoming ignorant of the game’s performance, but it underscores the studio’s dedication to data-driven development.

 

This transparency is important: it helps team members understand development bottlenecks and prevents crunch. But perhaps most importantly, transparency provides the team with the opportunity to collaborate and quickly unify its efforts in response to data.

 

By promoting transparency, the team allows development shortcomings to be addressed immediately and convenient yet ultimately ineffective design strategies to be dispelled on the basis of objective analysis.

 

As a strategy

 

Defining done, avoiding ignorance through testing and measurement, and removing opacity from the development process contribute to an environment in which data reinforces creative insight without supplanting it. Data is the lifeblood of the freemium model, but it’s not prescriptive – the way in which data is used to improve upon the product and bring delight to consumers is an exercise which is enhanced by experience and insight.

 

Eric Seufert is a quantitative marketer with a specific interest in predictive analytics and quantitative methods for freemium product development. He blogs frequently at ufert.se (link: ufert.se), and his forthcoming book, Freemium Economics (link: freemiumeconomics.com), will be published by Morgan Kaufmann in early 2014.
