Computer science is changing: the amount of data available for processing is growing exponentially, and the emphasis on how to handle it must grow with it. Like the 19th-century shift in physics from mechanics to statistical mechanics, the new algorithms sacrifice the precision of a unique answer for the fast search of statistical properties. The following draft of a book by Hopcroft and Kannan blazes the trail for what future algorithms textbooks may look like: heavy on proofs, with many topics selected for their mathematical elegance rather than their pragmatism. In the final version of this much-anticipated book, I would love to see more content on hash algorithms, parallel algorithms and graph spanners, or a more extensive discussion of Support Vector Machines.
2013 is coming to an end, and many papers have been published this year. My recommendations for the next holidays follow: deep, imaginative, mixing the practical and the theoretical, and perfect for inspiring hope in a better future.

A Simple Algorithm for the Graph Minor Decomposition. A quadratic-time algorithm for computing graph minor decompositions is presented, improving on the previous cubic-time algorithm. It’s very intuitive and easy to understand, so I bet it will appear in future books on algorithms, turning it into a classic.

Dynamic graph connectivity in polylogarithmic worst case time. The classic problem of whether there is a path between two nodes in a dynamically changing graph now has a polylogarithmic algorithm: a theoretical advance with multiple nice applications in practice (routing, transportation, robotics…).

Ambient backscatter: wireless communication out of thin air. Imagine devices communicating wirelessly using ambient radio frequency as their only source of power, by backscattering ambient RF signals: the Internet of Things (IoT) may well run over this in the future.

Automatic patch generation learned from human-written patches. Although the rate of successfully generated patches is still quite low, it’s interesting to learn that there is progress in this frankly quite boring area of software maintenance.

F10, A Fault-Tolerant Engineered Network. A novel way to successfully attack the commonplace problem of network reliability and performance in datacenters by re-examining all the involved parts: network topology, routing algorithm and failure detection.

Pinocchio: Nearly Practical Verifiable Computation. Amazing improvements in the generation of proofs of general computations and their verification, making practical this basic necessity of the coming cloud computing landscape. Combine this with Pepper and you will have covered the most important advances of this year on the future of cloud computing.
 The library of George B. Dantzig is for sale: peering over the libraries of great men is as enlightening as their work.
 Benford’s Law and the Art of Succeeding in Multiple Choice Tests: a fun application of Benford’s Law.
 Polymath8: I’ve been following this project on collaborative mathematics, and the article in Wired is a good description of its course.
 Rota’s conjecture proved: well, almost proved; there are some problems with the inequivalent representations of matroids, but it’s almost here.
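For those unfamiliar with the second item, Benford’s Law predicts that the leading digit d of many real-world datasets appears with probability log10(1 + 1/d), so a leading 1 is more than six times as likely as a leading 9. A minimal sketch of the distribution:

```python
import math

# First-digit probabilities under Benford's Law: P(d) = log10(1 + 1/d)
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d, p in benford.items():
    print(d, f"{p:.1%}")
# Leading digit 1 appears about 30.1% of the time, digit 9 only about 4.6%
```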
After Edward Snowden’s revelations had been publicly discussed for months, I can only ask: did nobody see this coming? There is a clear precedent: the NSA’s Project SHAMROCK collected all telegraphic data entering or exiting the United States from 1945 to 1975. And more recently, the gargantuan scale of the Great Firewall of China was an omen of what more advanced countries could build to spy on their citizens.
Privacy will be gradually eroded, being a weak private right and a strong public wrong whenever it faces the presumption of illegality, and prosecution laws are already being modified to be more accepting of evidence gathered through warrantless surveillance programs, especially in common-law countries where police forces have ample power to interpret the law. And if in the future all surveillance is perfectly automated with almost no human intervention, who would really care about an invasion of privacy carried out by amoral, unsentient machines?
What I find really fascinating is that Snowden’s revelations haven’t brought us any advanced technology: it’s almost as if the NSA didn’t have any real technology edge over current commercial technologies, which I don’t really buy. Meanwhile, the private sector develops and markets technologies like PredPol for real-time crime prediction: an excellent predictor of what’s to come.
 Statistical Basis for Predicting Technological Progress: an excellent paper on how easy it is to forecast technological progress for some selected technologies (the main problem now being the identification of those technologies).
 Amdahl’s Law vs. Gustafson-Barsis’ Law: it contains an interesting insight on the impact of parallelization in relation to power consumption.
 Moore’s Law in Photonics: the mass diffusion of photonics is nearer than anyone might think.
 Predicting the Path of Technological Innovation: this paper introduces a simple model that works quite well across markets.
 The Economics of Long-Term Digital Storage: a comparison and forecast of the different storage technologies.
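The contrast in the Amdahl vs. Gustafson-Barsis item can be made concrete with the two formulas themselves: for a parallel fraction p and n processors, Amdahl’s Law gives the speedup at fixed problem size, S = 1/((1-p) + p/n), while Gustafson-Barsis’ Law gives the scaled speedup at fixed runtime, S = (1-p) + pn. A minimal sketch (the parallel fraction p = 0.95 is an illustrative assumption, not taken from the paper):

```python
def amdahl(p, n):
    # Speedup at fixed problem size: the serial fraction (1-p) dominates as n grows
    return 1 / ((1 - p) + p / n)

def gustafson(p, n):
    # Scaled speedup at fixed runtime: the problem size grows with n
    return (1 - p) + p * n

p = 0.95  # illustrative parallel fraction
for n in (16, 256, 4096):
    print(n, amdahl(p, n), gustafson(p, n))
# Amdahl's speedup saturates near 1/(1-p) = 20, Gustafson's keeps growing with n
```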
Cloud Deployment Manager is the first and only tool to automagically deploy Microsoft clouds from a diagram, with just the click of a button: it represents the cutting edge of DevOps research and innovation on the Windows platform.
It’s amazing what can be achieved to improve how systems are currently architected with the right tools and some code.
 Rooting SIM cards: the latest research by Karsten Nohl on GSM security (as if the typical problems of cellular encryption weren’t enough)
 The early history of computing to automate cryptanalysis (declassified)
 TLS forward secrecy has been proposed as the easiest solution to PRISM-like attacks, but it may be very easy to botch
 Buggy and poorly implemented protocols are the quickest path to breaking any encrypted voice protocol, as the recent weaknesses in ZRTPCPP show
 DNSCrypt: I remember playing with a similar tool a decade ago, but this one is incredibly advanced
I’ve just finished these three books published this year on the intersection of finance and programming:
 C# for Financial Markets. This recently published book is just the translation to C# of all the previous books by the same author, especially the ones on the intersection between finance and C++. As such, one-third of the book delves into the implementation of basic mathematical finance (bonds, futures, options, swaptions, curve and surface modelling, the finite difference method, …) and two-thirds delve into teaching the C# language and its interoperability with Excel/LINQ/C++: note that if you’re already a pro in C#, you’d better skip those parts since they are far from being the most authoritative source, although the sections on interoperability are really instructive. The best point of this book is that it is really full of examples to enlighten every concept (850 pages long!), although they never manage to compose a full application of any financial worth (that’s left as an exercise for the reader!): thus, and only on this technology angle, it’s the best book for beginners.
 Financial Modelling – Theory, Implementation and Practice (MATLAB). Do you need a book to quickly acquaint yourself with the state of the art in financial mathematics for asset allocation, hedging and derivatives pricing, skipping the study of dozens of papers? Then this book is your definitive shortcut: it covers everything from the derivation of the models to their implementation in Matlab, demonstrating that this language can also be used for prototyping purposes in the financial industry, efficiency and interoperability aside (if you don’t know Matlab, a third of the book delves into that). My favourite part of the book is the first one on models (stochastic, jump models, multidimensional, copulas), which reads lightly and fast in just an afternoon, but the book is also over-focused on the numerical implementation of the models (a third of the book), when most of these details are just left to some library in the real world. Even so, just running over all its examples is worth its full price.
 Financial Risk Modelling and Portfolio Optimization with R. R is the lingua franca of statistical research, and its underutilization in the financial industry is a real puzzle; the sheer number of packages dealing with every imaginable statistical function should be enough to justify a much deeper penetration into daily use. This book is best suited for quantitative risk managers, and it surveys the latest techniques for modelling and measuring financial risk, besides portfolio optimisation techniques. Every topic follows a precisely defined structure: after a brief overview, an explanation is offered, and a very interesting synopsis of R packages and their empirical applications ends the discussion. My favourite parts are at the end, on constructing optimal portfolios subject to risk constraints and on tactical asset allocation based on variations of the Black-Litterman approach.
Charles Handy divided management cultures into four types in his now-classic book on business administration, Gods of Management:
 Zeus culture, a club culture with centralized power, in which speed of decision and accountability are prioritized over any other virtue
 Apollo culture, a role-based culture with many rules and procedures, in which stability and predictability are prioritized, creating resistance to change
 Athena culture, a task-based culture focused on providing solutions to concrete problems, ideal in times of expansion when new products and technologies get introduced
 Dionysus culture, an existential culture in which the organization exists for the sole purpose of helping professionals achieve their personal goals
In its last reorganization, Microsoft shifted from an Athena-like culture to an Apollo-like culture, a typical change in business reorganizations also predicted by Handy in his book. However, the general consensus is that this kind of management culture is inferior, as shown in multiple studies:
 Managerial overload and organization design: a divisional organization is more efficient than a functional organization when the number of projects is large, and Microsoft has 13 products and services with revenues of more than one billion dollars per year.
 Organization Design: different management structures are compared based on the internal coordination costs imposed on the firm, very relevant to the Microsoft reorganization.
 A double moral hazard model of organization design: a model based on moral hazard is introduced to explain why suboptimal business structures are chosen, starting from the general fact that the divisional structure is more efficient than the functional structure.
Only the malleable nature of software and the mandate to deploy a common software platform on any device and any type of screen may outweigh the previously presented disadvantages: be that as it may, MSFT is pursuing a much-needed goal for the whole computer industry.
 Turing award goes to Goldwasser & Micali: zero-knowledge, semantic security and universal forgery under chosen message attack.
 The DDoS That Almost Broke the Internet: attacking whole Internet eXchanges just to bring down SpamHaus.
 Cybersecurity: The digital arms trade: price quotes for exploits, The Economist’s style.
 NSA Cryptologs: was this really classified? Many opinions and little-to-no noteworthy facts.
 RC4 as used in TLS/SSL is broken: see page 306. More here.
Elasticity and scalability via dynamic provisioning of servers and other resources is one of the tenets of cloud computing: in practice, automatically growing the resources available to an application based on the level of demand is eased with patterns like WASABi, the Autoscaling Application Block for Azure.
But even if an application can grow automatically, a question remains unanswered: what is the minimum number of servers that should be reserved? That is a complex question, requiring a trade-off between the losses from being unable to properly service requests (booting up new instances takes some minutes) and the actual costs of instances that may sit idle with no load.
Let’s consider a service with a low margin (10%) and a really intensive computation (8 requests per second per Azure XL server), where the probability that a customer does not enter the service due to congestion is 1%, with a much higher probability of 2.5% of leaving the service while waiting after a request. To model the whole system, an M/M/s queue is considered: s servers fed by a Poisson input process with exponential service times and an unlimited waiting queue. The queue model is solved with the following R code, producing the plot described below:
# Minimum number of roles to calculate
minRoles <- 4
# Maximum number of roles to calculate
maxRoles <- 12
# Probability of not entering the service due to high load
probNotJoin <- 0.01
lossNotJoin <- 0
# Probability of leaving the service due to high load
probLeaving <- 0.025
lossLeaving <- 0
# Revenue per request
customerRevenue <- 1
# Profit per request
customerProfit <- 0.1
# Azure XL web/worker role ($0.96/hour)
costServer <- 0.96
# New requests/hour
arrivalRate <- 90000
# Requests/hour serviced per role
serviceRate <- 30000
serviceCost <- 0
totalCost <- 0

library(pdq)
for (r in minRoles:maxRoles) {
  Init("")
  CreateMultiNode(r, "server", CEN, FCFS)
  CreateOpen("requests", arrivalRate)
  SetDemand("server", "requests", 1/serviceRate)
  Solve(CANON)
  # Mean number waiting: total queue length minus the requests in service
  queueLength <- GetQueueLength("server", "requests", TRANS) - arrivalRate/serviceRate
  # Mean waiting time in minutes (Little's law)
  waitingTime <- 60 * queueLength/arrivalRate
  numNotJoin <- queueLength * probNotJoin * arrivalRate
  numLeaving <- waitingTime * probLeaving * arrivalRate
  serviceCost[r] <- r * costServer
  lossNotJoin[r] <- numNotJoin * customerRevenue * customerProfit
  lossLeaving[r] <- numLeaving * customerRevenue * customerProfit
  totalCost[r] <- serviceCost[r] + lossNotJoin[r] + lossLeaving[r]
}
plot(totalCost, xlim=c(minRoles, maxRoles), type="b", col="black", lwd=2,
     main="Optimal Elasticity of Demand", xlab="Servers (s)", ylab="Cost ($)")
lines(serviceCost, xlim=c(minRoles, maxRoles), type="b", col="brown", lty="dashed")
lines(lossNotJoin, xlim=c(minRoles, maxRoles), type="b", col="green", lty="dashed")
lines(lossLeaving, xlim=c(minRoles, maxRoles), type="b", col="orange", lty="dashed")
The black curve represents the total cost: the sum of the server costs (brown curve) and the losses due to customers not entering the service (green curve) and leaving the service (orange curve). The optimal reservation of servers is found at s=8, the minimum of the total cost curve.
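As a cross-check, the same M/M/s model can be solved in closed form with the Erlang C formula instead of the PDQ solver; the following Python sketch (my own, not part of the original analysis) recomputes the cost curve under the same parameters and finds the same minimum at s=8:

```python
def erlang_c(s, a):
    """Probability an arriving request must wait in an M/M/s queue
    with offered load a = lambda/mu (Erlang C formula)."""
    b = 1.0  # Erlang B, computed with the standard recursion
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    rho = a / s
    return b / (1 - rho * (1 - b))

# Same parameters as the R model
arrival_rate = 90000.0      # requests/hour
service_rate = 30000.0      # requests/hour per role
cost_server = 0.96          # $/hour per Azure XL role
prob_not_join, prob_leaving = 0.01, 0.025
revenue, profit = 1.0, 0.1

a = arrival_rate / service_rate  # offered load: 3 Erlangs
costs = {}
for s in range(4, 13):
    lq = erlang_c(s, a) * (a / s) / (1 - a / s)   # mean number waiting
    wait_min = 60 * lq / arrival_rate              # mean wait in minutes
    loss_not_join = lq * prob_not_join * arrival_rate * revenue * profit
    loss_leaving = wait_min * prob_leaving * arrival_rate * revenue * profit
    costs[s] = s * cost_server + loss_not_join + loss_leaving

best = min(costs, key=costs.get)
print(best)  # → 8
```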