February 20, 2015
Written by Edward Raff
Support Vector Machines (SVMs) may not be as popular as neural networks within data science, but they remain powerful, useful algorithms. One of the difficulties with SVMs has been the computational effort required to train them. However, LIBSVM, which has been in use for over a decade, can fairly easily handle the 30,000 training points in the National Data Science Bowl competition’s data set. That makes SVMs a viable tool for you to use both in general and for competing in the National Data Science Bowl.
At their heart, SVMs find a linear solution to the problem presented by the data set, similar to the long-used logistic regression and linear regression. What the SVM does differently is try to maximize the distance between the classes involved, also known as margin maximization. The intuition: if you find the line that maximizes the distance between the two classes, it is more likely to generalize well to unseen data.
What makes SVMs in data science truly interesting is what is known as the “kernel trick.” This lets the algorithm transform your input features into a different feature space and find a linear solution there, which may be non-linear in the original dimensions. A kernel has the form K(x, y), where x and y are two feature vectors; it outputs a single real value, with larger values indicating more similarity and smaller values less similarity between x and y. For example, if you wanted to include the interactions of every combination of two features, you might create extra features representing the product of each pair. This is equivalent to creating the degree-2 polynomial expansion of the features, which the SVM can do as a kernel trick. However, since the SVM can do this without actually forming the features, you can raise the degree as high as you like and it will still work (if you tried forming the expanded features yourself, you might very quickly find yourself unable to learn the model or even fit the data in memory).
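To make this concrete, here is a minimal sketch (the toy data and parameters are my own, not from the competition) showing that a degree-2 polynomial kernel in scikit-learn can learn a feature interaction without ever materializing the expanded features:

```python
# Hypothetical toy data: labels depend on the product of two features,
# which a plain linear model cannot capture but a degree-2 kernel can.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                 # 200 points, 5 features
y = (X[:, 0] * X[:, 1] > 0).astype(int)       # label driven by an interaction

# degree=2 is equivalent to learning over all pairwise feature products,
# but the kernel trick computes K(x, y) = (gamma * <x, y> + coef0)^2
# directly, never building the expanded feature vectors.
clf = SVC(kernel="poly", degree=2, coef0=1.0, gamma="scale")
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```

Raising `degree` here costs essentially nothing extra, whereas explicitly constructing all degree-d feature products grows combinatorially.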
Another common choice, and probably the most widely used kernel of all, is the RBF (radial basis function) kernel. It has the nice property that using it with an SVM can be interpreted as a type of smarter nearest-neighbor search.
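As a quick illustration of that interpretation (toy data and the gamma value are my own choices), the RBF decision function really is a kernel-weighted comparison of a new point against the support vectors, so each support vector contributes a Gaussian "bump" centered on itself:

```python
# RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2). Prediction is a weighted
# sum of similarities to the support vectors -- the "smarter nearest
# neighbor" view of the RBF SVM.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = (np.linalg.norm(X, axis=1) < 1.0).astype(int)  # circular, non-linear boundary

clf = SVC(kernel="rbf", gamma=1.0).fit(X, y)

# Reproduce the decision function by hand: a kernel-weighted sum over the
# support vectors plus the bias term.
x_new = np.array([[0.1, -0.2]])
manual = rbf_kernel(x_new, clf.support_vectors_, gamma=1.0) @ clf.dual_coef_.T \
         + clf.intercept_
assert np.allclose(manual, clf.decision_function(x_new))
```

Unlike plain nearest neighbor, the weights (the dual coefficients) are learned, so uninformative training points drop out of the comparison entirely.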
Additionally, SVMs combined with kernels have some very nice mathematical properties. One of these is that you can add or multiply kernels together to get a new, valid kernel. We can use this to incorporate different types of features more elegantly into a single model.
In the case of the National Data Science Bowl data set, we could use a combination of three kernels, one for each feature type, as Aaron Sander (one of my fellow data scientists) suggested in his post (call them k1, k2, and k3). If I noticed spatial features (k1) tend to give low scores only when two inputs were definitely different classes, I could define K(x,y) = k1(k2 + k3). That way, when the spatial features indicate a low match, the combined score will strongly discourage the algorithm, even if the other sets of features thought the match was viable. I could even add some extra knobs to tune, making K(x,y) = k1(c2 k2 + c3 k3), allowing me to favor one set of features over the others.
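A hedged sketch of what that combination could look like in code. The feature-column split and the weights c2 and c3 below are purely illustrative, not the competition's actual features; scikit-learn's SVC accepts any callable that returns a Gram matrix:

```python
# Combine kernels as K(x, y) = k1 * (c2*k2 + c3*k3). Products and weighted
# sums of valid kernels are themselves valid (positive semi-definite) kernels.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

def combined_kernel(A, B, c2=1.0, c3=0.5):
    k1 = rbf_kernel(A[:, 0:2], B[:, 0:2])   # e.g. "spatial" features (hypothetical split)
    k2 = rbf_kernel(A[:, 2:4], B[:, 2:4])   # e.g. second feature type
    k3 = rbf_kernel(A[:, 4:6], B[:, 4:6])   # e.g. third feature type
    # Multiplying by k1 means a near-zero spatial match suppresses the
    # whole score, no matter what k2 and k3 say.
    return k1 * (c2 * k2 + c3 * k3)

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 6))
y = (X[:, 0] > 0).astype(int)

clf = SVC(kernel=combined_kernel).fit(X, y)
print("training accuracy:", clf.score(X, y))
```

The weights c2 and c3 would be tuned like any other hyperparameter, for example by cross-validation.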
This type of feature combination is especially useful for data that cannot be represented as a fixed-length feature vector. Kernels can be defined directly for text data, graphs (like the connections in a social network), and many other structured problems. By using a different kernel for each feature type, we can apply SVMs to more kinds of information than many other algorithms can, and we can use them all simultaneously.
Hopefully, this has motivated you to explore the SVM as a potential option for solving this and future problems involving complex data sets. For more information about LIBSVM, the authors have a short guide on how to use their software, which includes good advice on using SVMs in general. It’s easily available in Python (scikit-learn), R (the e1071 package), and Java (Weka), and it has been ported on its own and wrapped into numerous other programming languages and libraries as well.
Feel free to talk or ask me questions @EdwardRaffML. Good luck!