On getting rejected

One of the most frustrating things when starting a career in academic research is getting your paper rejected.

Even with more experience, no one enjoys a rejection and we all prefer to see good news.

Quite often this happens on the first project you tackle. A new student might work for a year on a project, pouring in effort and passion and producing something that seems to have real merit ... only to be hit with a cold, hard rejection.

And then, with the computer science conference submission schedule, you have no opportunity to respond to the reviewers and you might have to wait six months or more for another appropriate chance to submit. Science is a slow process.

But there's one bit of silver lining: A paper's rejection doesn't mean your research is bad.

In fact, many or most of the papers you see in their final polished form in top venues went through rejections -- even the best papers. Here's a thought experiment. Among a set of published papers, some got in on the first try, some were rejected once, and others were rejected multiple times. Ultimately, how impactful are the papers that got in on the first try, compared to the rejected ones?

To answer that let's (imperfectly) measure impact in terms of citations. Here are my own published research projects, showing the number of times the project received a rejection (X axis), and the number of citations per year (Y axis), as of a few weeks ago:

First you'll note that most of my projects were rejected at some point, in a failed workshop, conference, or journal submission, before reaching successful publication. Furthermore, among these published papers, there is apparently no correlation between whether the project was rejected and its eventual impact. In fact, if we were to draw a best fit line:

...then we see that my published projects have received 3.96 more citations per year, per rejection. Not bad, considering the average number of citations per year here is 13.4. This is not a very robust method, given the small sample and the skewed distribution of citations. But a 2012 study by Calcagno et al. spanning 923 journals similarly showed that rejected papers received "significantly" more citations than those accepted on the first submission.
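If you're curious about the mechanics, the slope above comes from an ordinary least-squares fit of citations per year against number of rejections. Here is a minimal sketch of that computation; the numbers in it are placeholders, not my actual project data:

    # Minimal sketch: ordinary least-squares fit of citations/year against
    # number of rejections. The data below is made up for illustration.
    import numpy as np

    rejections = np.array([0, 0, 1, 1, 2, 3])                      # X: rejections per project (placeholder)
    cites_per_year = np.array([5.0, 20.0, 8.0, 30.0, 15.0, 25.0])  # Y: citations per year (placeholder)

    slope, intercept = np.polyfit(rejections, cites_per_year, deg=1)
    print(f"{slope:.2f} extra citations/year per rejection (intercept {intercept:.2f})")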

This might be counterintuitive. Doesn't a rejection mean that the project is less exciting or less well executed, which would certainly imply lower impact? Perhaps, but there are at least two other factors:

  • A rejection can improve the quality of a project's next submission, due to feedback from peer review and time to improve the manuscript.
  • Authors might decide, based on the rejection, not to bother resubmitting. Those dead manuscripts dropped out of my sample.

Here's what I think this says: You should let your best judgement, not the reviewers' decision, guide your research. Certainly you should give careful consideration to how reviewers reacted to your paper, but don't automatically take a rejection as an indication of the quality of a project. If you still believe the work is good, then you are probably right: it is good, with at least as much impact potential as an immediately accepted paper.

OK ... now go have some ice cream.

Proposal: CoolNets 2014

The SIGCOMM workshop proposal deadline is coming up. But the community already has many workshops devoted to popular hot topics and multiple sessions on SDN at every top networking venue. Here's my proposal.

CoolNets 2014: The First Workshop on Cool Topics in Networks

The First Workshop on Cool Topics in Networks (CoolNets 2014) seeks papers describing totally groovy contributions to the field of computer communication networks that are not related to presently anointed hot topics. We invite submissions on a wide range of networking research, including, but not limited to, results which are awesome, neat-o, nifty, keen, swell, rad, da bomb, badass, and slick as a greased herring in a butter factory, while being refreshingly orthogonal to:

  • software-defined networking
  • cloud computing
  • content-centric networks
  • virtualization of networks, network functions, and machines
  • big data

Submissions on the above topics will be held without review for five years, and then fast-tracked to a prestigious journal unless IP-based QoS has achieved 80% deployment. Such submissions are considered not cool, dude.

Strong submissions may break ground in new research directions, reveal insights about longstanding problems, develop alternative solutions whose clean design will have lasting value, forge connections across fields, deconstruct misconceptions, contribute solid technical advances in important areas, or build networked systems that are just pretty darned cool.

Multi-part series of totally tubular papers will be considered.

Creating research ideas over time

Research is built on ideas: identifying questions to investigate, problems to solve, or new techniques to solve them. Before I started as faculty, one of my biggest doubts was whether I would have good enough ideas to lead a research group and help shape five or six years of each student's career. There is no deterministic procedure for sitting down and generating an idea.

However, we can think about how to improve the conditions for them to pseudorandomly appear.

Since sometime in my first year of grad school (2003), I've kept a document logging ideas for research projects. The criterion for including an idea was simple and has remained, at a high level, fairly consistent: when I have an idea that I think would have a reasonable chance of leading to a publishable paper, I jot down a description and notes. This helps me remember the idea, and the document is also a convenient place to record notes over time, for example if I notice a related paper months later.

Having grown over almost exactly ten years to 169 entries, this document is now an interesting data set in its own right.

The data

Of course, this is not a uniform-random sample of ideas. There are various biases; not every idea makes it into the document, my inclusion standards might have changed over time, and so on. And many of these ideas are, in retrospect, terrible. But let's take a look anyway.

Here is the number of ideas in the document per year. (The first and last were half years and so the value shown is twice the actual number of ideas.)

Now let's probe a bit deeper. The number of ideas might not tell the whole story. Their quality matters, too. To investigate that, I annotated each idea with whether (by 2013) it successfully produced a published paper. I also tagged each of the 169 ideas with an estimate of its quality in retrospect (as subjectively judged by my 2013 self), using a scale from 1 to 10 where

  • 5 = dubious: maybe not publishable, or too vague to have much value
  • 6 = potential to result in a second-tier publication
  • 8 = potential to result in a top-tier publication (e.g. SIGCOMM or NSDI in my field)
  • 10 = potential to result in a top-tier publication and have significant additional impact (e.g., forming the basis of a student's thesis, producing a series of papers, a startup, etc.)
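
For the curious, the per-year statistics in the plots that follow boil down to simple bookkeeping over these annotations. Here is a minimal sketch; the short `ideas` list is hypothetical, standing in for the real 169-entry document:

    # Sketch of the per-year tallies behind the plots. The `ideas` list here
    # is hypothetical; each entry is (year, retrospective quality 1-10,
    # whether it produced a published paper).
    from collections import defaultdict

    ideas = [(2003, 5, False), (2004, 6, True), (2008, 8, True),
             (2008, 7, False), (2009, 10, True)]

    per_year = defaultdict(list)
    for year, quality, published in ideas:
        per_year[year].append((quality, published))

    for year in sorted(per_year):
        entries = per_year[year]
        mean_quality = sum(q for q, _ in entries) / len(entries)
        papers = sum(1 for _, published in entries if published)
        print(f"{year}: {len(entries)} ideas, mean quality {mean_quality:.1f}, "
              f"{papers} produced papers")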

The number of reasonably high quality ideas and the number that produced papers both show significant jumps in 2008-2009, though with different behavior later.  Also, the plot below shows perhaps a gentle increase in the mean idea quality over time, and a bigger jump in the quality of the top 5 best ideas in 2008. Note that even a one-point quality difference is quite significant.

The most prominent feature of this graph is an enormous spike in number of ideas in the 2008-2009 timeframe, and a corresponding increase in higher-quality ideas. What might have caused this? And what can we conclude more generally?

Ideas need time for creative thought

A significant change in my life during 2008-2009 was my transition from PhD dissertation work to a postdoc year (split between working on post-dissertation projects at my PhD institution of UC Berkeley, and working with Sylvia Ratnasamy at Intel Labs).

This appears to show the value of having time to step back and think -- and also the opportunity to interact with a new set of people. By May 2008, I was largely done with my dissertation work (though I did work on it more later and finally graduated in May 2009). I had accepted a position here at the University of Illinois and deferred for a year. So I was largely free of responsibilities and concerns about employment, and had more time to be creative. While there are reasons to be concerned about the surge of postdocs in computer science, I think this indicates why this particular kind of postdoc can be extremely valuable: providing time and space for creative thought, and new inspiration.

If that is the explanation, then it seems I was not sufficiently proactive about creating time to be creative after the flood of professorial tasks hit in late 2009.

There are alternative explanations.  For example, knowing that I was about to enter a faculty position, I might have more proactively recorded ideas for my future students to work on. However, that would not explain another observation -- that my creative expression at this time in other areas of my life outside computer science seemed to increase as well.

John Cleese has argued that creativity takes time, and it's more likely to happen in an "open mode" of playful, even absurd thought, rather than in a "closed mode" of efficiently executing tasks:

His talk makes other points relevant to academic research. In particular, you are less likely to get into an "open mode" of thought if you are interacting with people with whom you're not completely comfortable.  This should certainly affect your choice of collaborators.

It's worth noting that having time to enter an "open mode" of creative thought does not mean that one is thinking free of any constraints whatsoever. I personally find that constraints in a problem domain can provide some structure for creative thought, like improvising around a song's fixed chord changes in jazz.

Ideas need time to germinate

In fact, some ideas need years.

You'll note from the second plot above that the number of paper-producing ideas is zero in 2012 and 2013. This is not just random variation: it's actually fairly unlikely to have an idea and immediately turn it around into a paper. In fact, it has happened fairly often that an idea takes a year or two to "germinate". I might write down the seed of an idea and, at that time, not recognize whether it is valuable or what it might become. In coming back to it occasionally, combining it with other ideas, and bouncing it off other people, the context, motivation, and focus of the idea gradually take shape until it is something much stronger, which I can recognize as a worthwhile endeavor.

And that is all before the right opportunity appears to begin the project in earnest -- such as a PhD student who is looking for a new project and is interested in the area -- and the project is actually developed and the paper written and submitted (and resubmitted ...) and finally published. The most extreme example I've been involved with was a 2005 idea-seed that was finally published in a top conference seven years later. In fact, in processing this data I realized there was a second idea from 2005 that lacked sufficient motivation at the time and got somewhat lost until 2011, when it combined with a new take on a similar idea that grew out of a student's work, and was published in 2012. The plot below shows 14 lines, each corresponding to a project, with points at the inception year of the seed idea, at intermediate ideas (if any) that combined with it, and finally at the year of publication.

Ideas from connections

Reading over the document made it clear that very few, if any, of the ideas sprang out of nowhere. They came from connections: with a paper I read, with a previous project, or with collaborators in conversation. Some of these connections can be quite unexpected. For example, one project on future Internet architecture indirectly inspired a project on network debugging.

Many of the ideas on the list in fact owe at least as much to collaborators as they do to me. This is likely a big part of the rise in the number of ideas after becoming faculty. Although I lost some of my open creative time after beginning as faculty, I gained a set of fantastic students.

Conclusions

Generating and selecting among ideas is an art, one of the most important arts to learn over years of grad school. I will never feel that I've truly mastered that art. But studying my own history has suggested some strategies and conditions that seem to help, or at least seem to help me.

Ideas are more likely to appear when I have time, or create time, to think creatively; they don't simply appear for free.

They often need to germinate over a period of months or years.

And perhaps most importantly, they are most likely to grow out of connections with other work and other people.

Live-blogging HotNets 2012, Day Two

This is Day Two. Day One is here.

Mobile and Wireless

Calum Harrison presented work on making rateless codes more power-efficient. Although rateless codes do a great job of approaching the Shannon capacity of the wireless channel, they're computationally expensive, and this can be a problem on mobile devices. This paper also optimizes for the cost of decoding, measured in CPU operations, and achieves 10-70% fewer operations at a competitive rate. [Calum Harrison, Kyle Jamieson: Power-Aware Rateless Codes in Mobile Wireless Communication]

Shailendra Singh showed that there isn't one single wireless transmission strategy that is always best. For each of DAS, Reuse, and Net-MIMO, there exists a user profile (are they moving, how much interference is there, etc.) for which that scheme is better than the others, which this paper experimentally verified. TRINITY is a system they're building to automatically get the best of each scheme in a heterogeneous world. [Shailendra Singh, Karthikeyan Sundaresan, Amir Khojastepour, Sampath Rangarajan, Srikanth Krishnamurthy: One Strategy Does Not Serve All: Tailoring Wireless Transmission Strategies to User Profiles]

Narseo Vallina-Rodriguez argued for something that may be slightly radical: "onloading" traffic from a wired DSL network onto wireless networks. We sometimes think of wireless bandwidth as a scarce resource, but in some situations your wireless throughput could easily be twice your DSL throughput. If there is spare wireless capacity, why not use it? 40% of users use less than 10% of their allocated wireless data volume. They tested this idea in a variety of locations at different times and can get order-of-magnitude improvements in video streaming buffering. Apparently the reviewers noted that wireless providers wouldn't be big fans of this — but Narseo noted that his coauthors are all from Telefonica. Interesting question from Brad Karp: How did we get here? Telefonica owns both the DSL and the wireless; if you need additional capacity, is it cheaper to build out wireless capacity or wired? The answer seems to be that wired is way cheaper, but we need to have wireless anyway. Another commenter: this is promising because measurements show congestion on wireless and DSL peaks at different times. Open question: Will this benefit hold long term? [Narseo Vallina-Rodriguez, Vijay Erramilli, Yan Grunenberger, Laszlo Gyarmati, Nikolaos Laoutaris, Rade Stanojevic, Konstantina Papagiannaki: When David helps Goliath: The Case for 3G OnLoading.]

Data Center Networks

Mosharaf Chowdhury's work dealt with the fact that the multiple recent projects improving data center flow scheduling deal with just that — flows — treating each flow in isolation. On the other hand, applications imply dependencies: for example, a partition-aggregate workload may need all of its flows to finish, and finishing one flow early is useless on its own. The goal of Coflow is to expose that information to the network to improve scheduling. One question asked was: what is the tradeoff with the complexity of the API? [Mosharaf Chowdhury, Ion Stoica: Coflow: An Application Layer Abstraction for Cluster Networking]

Nathan Farrington presented a new approach to building hybrid data center networks, with both a traditional packet-switched network and a circuit-switched (e.g., optical) network. An optical switch provides much higher point-to-point bandwidth but switching is slow — far too slow for packet-level switching. Prior work used hotspot scheduling, where the circuit switch is configured to help the elephant flows. But the performance of hotspot scheduling depends on the traffic matrix. Here, Nathan introduced Traffic Matrix Scheduling: the idea is to cycle through a series of switch configurations (input-output assignments), such that the collection of all assignments fulfills the entire traffic matrix. Q: Once you reach 100% traffic over optical, is there anything stopping you from eliminating the packet-switched network entirely? There is still latency on the order of 1 ms to complete one round of assignments; 1 ms is much higher than electrical DC network RTTs. Q: Where does the traffic matrix come from? Do you have to predict, or wait until you've buffered some traffic? Either way, there's a tradeoff. [Nathan Farrington, George Porter, Yeshaiahu Fainman, George Papen, Amin Vahdat: Hunting Mice with Microsecond Circuit Switches]
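
To make the scheduling idea concrete, here is a toy sketch (my own illustration, not the paper's actual algorithm): repeatedly pick a greedy input-output matching over the remaining demand, hold that circuit configuration long enough to drain its smallest entry, subtract what was served, and continue until the whole matrix is fulfilled.

    # Toy illustration of traffic matrix scheduling (not the paper's actual
    # algorithm): greedily choose input-output matchings and subtract the
    # served demand until the whole traffic matrix is fulfilled.
    def traffic_matrix_schedule(demand):
        """demand[i][j] = traffic from input i to output j.
        Returns a list of (matching, duration) pairs."""
        demand = [row[:] for row in demand]  # work on a copy
        assignments = []
        while any(cell > 0 for row in demand for cell in row):
            # Greedy matching: largest remaining entries first, at most one
            # per input port and one per output port.
            cells = sorted(((demand[i][j], i, j)
                            for i in range(len(demand))
                            for j in range(len(demand[0]))
                            if demand[i][j] > 0), reverse=True)
            used_in, used_out, matching = set(), set(), []
            for _, i, j in cells:
                if i not in used_in and j not in used_out:
                    matching.append((i, j))
                    used_in.add(i)
                    used_out.add(j)
            # Hold this configuration just long enough to drain its smallest entry.
            duration = min(demand[i][j] for i, j in matching)
            for i, j in matching:
                demand[i][j] -= duration
            assignments.append((matching, duration))
        return assignments

    print(traffic_matrix_schedule([[3, 1], [2, 4]]))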

Mohammad Alizadeh took another look at finishing flows quickly in data centers. There are a number of recent protocols for this that are relatively complex. The design here is beautifully simple: each packet carries a priority, and routers simply forward high-priority packets first. Switches can have extremely small queues, since the packets they drop are likely low priority anyway. End-hosts can set each packet's priority based on flow size, and perform a very simple form of congestion control to avoid congestion collapse. Performance is very good, though with some more work to do for elephant flows in high-utilization regimes. [Mohammad Alizadeh, Shuang Yang, Sachin Katti, Nick McKeown, Balaji Prabhakar, Scott Shenker: Deconstructing Datacenter Packet Transport]
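
A minimal sketch of that forwarding behavior (my own illustration, not the authors' code): a tiny bounded output queue that always sends the highest-priority packet next and, when full, drops the lowest-priority one.

    # Illustrative sketch of priority-based forwarding with a tiny buffer
    # (my own example, not the authors' code). Lower number = higher priority;
    # end-hosts could set priority to a flow's remaining size so mice win.
    class PriorityPort:
        def __init__(self, capacity):
            self.capacity = capacity
            self.buffer = []  # list of (priority, packet)

        def enqueue(self, priority, packet):
            self.buffer.append((priority, packet))
            if len(self.buffer) > self.capacity:
                # Drop the lowest-priority packet (largest priority number).
                self.buffer.remove(max(self.buffer, key=lambda e: e[0]))

        def dequeue(self):
            if not self.buffer:
                return None
            best = min(self.buffer, key=lambda e: e[0])
            self.buffer.remove(best)
            return best[1]

    port = PriorityPort(capacity=2)
    port.enqueue(10_000, "elephant packet")
    port.enqueue(50, "mouse packet 1")
    port.enqueue(20, "mouse packet 2")   # buffer full: the elephant packet is dropped
    print(port.dequeue(), port.dequeue(), port.dequeue())  # mouse 2, mouse 1, None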

Lunch!

Routing and Forwarding

Gábor Rétvári tackled a compelling question: How much information is actually contained in a forwarding table? Can we compress the FIB down to a smaller size, making router hardware simpler and longer-lasting? Turns out, there's not so much information in a FIB: with some new techniques, a realistic DFZ FIB compresses down to 60-400 Kbytes, or 2-6 bits per prefix! A 4 million prefix FIB can fit in just 2.1 Mbyte of memory. Now the interesting thing is that this compression can support reasonably fast lookup directly on the compressed FIB, at least asymptotically speaking, based on an interesting new line of theory research on string self-indexing. One problem: They really need more realistic FIBs. The problem is that widely-available looking glass servers obscure the next-hops, which affect compression. "We are inclined to commit crimes to get your FIBs." Before they turn to a life of crime, why not send them FIBs? They have a demo! Question for the future: Can we use compressed forwarding tables at line speed? [Gábor Rétvári, Zoltán Csernátony, Attila Körösi, János Tapolcai, András Császár, Gábor Enyedi, Gergely Pongrácz: Compressing IP Forwarding Tables for Fun and Profit]
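
As a quick sanity check on those numbers (my arithmetic, not from the talk): 2.1 Mbytes is roughly 2.1 million bytes × 8 bits/byte ≈ 16.8 million bits, and 16.8 million bits / 4 million prefixes ≈ 4.2 bits per prefix, comfortably inside the quoted 2-6 bits-per-prefix range.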

Nikola Gvozdiev wins the award for best visualizations, with some nice animation of update propagation among iBGP routers. Their work develops the algorithms and systems necessary to propagate state changes in iBGP without causing any transient black holes or forwarding loops. [Nikola Gvozdiev, Brad Karp, Mark Handley: LOUP: Who's Afraid of the Big Bad Loop?]

Vasileios Kotronis's work takes SDN-based routing a step further: Don't just centralize within a domain, outsource your routing control to a contractor! One cool thing here, besides reduced management costs, is that you can go beyond what an individual domain can otherwise do — for example, the contractor has interdomain visibility and can perform cross-domain optimization, debug policy conflicts, etc. [Vasileios Kotronis, Bernhard Ager, Xenofontas Dimitropoulos: Outsourcing The Routing Control Logic: Better Internet Routing Based on SDN Principles]

User Behavior and Experience

Rade Stanojevic presented results from a large data set of mobile service plans (roughly a billion each of calls, SMS/MMS messages, and data sessions). The question: Are economic models of how users select bandwidth and service plans realistic? What choices do real people make? In fact, only 20% of customers choose the optimal tariff; the mean overpayment was 37%, the median 26%. Another interesting result: use of service peaks immediately after purchase and then decays steadily over at least a month, even with unlimited service (so it's not just because people are conservative as they near their service limits). Several questions: Do these results really demonstrate irrationality? Users may buy more service than they need so they don't have to worry about (and pay) comparatively pricey overage fees. Comment from an audience member: One has to imagine the marketing department of Telefonica has that exact same CDF of "irrationality" as their metric of success. [Rade Stanojevic, Vijay Erramilli, Konstantina Papagiannaki: Understanding Rationality: Cognitive Bias in Network Services]
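
To pin down what "overpayment" means here, a minimal sketch under my own toy tariff model of a flat fee plus per-MB overage (the real tariff structures in the study are more involved): compute the cheapest plan for the user's observed usage and compare it with what the chosen plan actually costs.

    # My own toy illustration of measuring tariff overpayment; the actual
    # tariffs in the study are more complicated than this.
    def plan_cost(plan, usage_mb):
        """Flat monthly fee plus per-MB overage beyond the included quota."""
        overage = max(0, usage_mb - plan["quota_mb"])
        return plan["fee"] + overage * plan["overage_per_mb"]

    plans = {
        "small":  {"fee": 10, "quota_mb": 500,  "overage_per_mb": 0.10},
        "medium": {"fee": 20, "quota_mb": 2000, "overage_per_mb": 0.05},
        "large":  {"fee": 35, "quota_mb": 8000, "overage_per_mb": 0.02},
    }

    usage_mb, chosen = 400, "large"   # hypothetical user on a too-big plan
    costs = {name: plan_cost(plan, usage_mb) for name, plan in plans.items()}
    optimal = min(costs, key=costs.get)
    overpayment = (costs[chosen] - costs[optimal]) / costs[optimal]
    print(f"optimal plan: {optimal}; overpayment: {overpayment:.0%}")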

Athula Balachandran presented a study working towards a quantitative metric to score user experience of video delivery (in particular, how long users end up watching the video). The problem here is that predicting user experience based on quantitative observables is hard: it's a complex function of initial startup delay, how often the player buffers, buffering time, bit rate, the type of video, and more. The paper analyzes how well user experience can be predicted using several techniques, based on data from Conviva. [Athula Balachandran, Vyas Sekar, Aditya Akella, Srinivasan Seshan, Ion Stoica, Hui Zhang: A Quest for an Internet Video Quality-of-Experience Metric]

Vijay Erramilli presented a measurement study of how web sites act on information that they know about you. In particular, do sites use price discrimination based on information they collect about your browsing behavior? Starting with clean machines and having them visit sites based on certain high- or low-value browsing profiles, they could subsequently measure how a set of search engines and shopping sites present results and prices to those different user profiles. They uncovered evidence of differences in search results, and some price differences on aggregators such as a mean 15% difference in hotel prices on Cheaptickets. Interestingly, there were also significant price differences based on the client's physical location. Q from Saikat Guha: How can you differentiate the vendor's intentional discrimination from unintentional? For example, in ad listings, having browsed a certain site can cause a Rolex ad to display, which bumps off an ad for a lower priced product. [Jakub Mikians, László Gyarmati, Vijay Erramilli, Nikolaos Laoutaris: Detecting Price and Search Discrimination in the Internet]

That's it! See you all next year...

Live-blogging HotNets 2012

Note: This blogging might be rather bursty. If you want something more deterministic, here's the HotNets program.

This is Day One. Day Two is here.

Session 1: Architecture and Future Directions

Teemu Koponen spoke about how combining the ideas of edge-core separation (from MPLS), separating control logic from the data plane (from SDN), and general-purpose computation on packets (from software routers) can lead to a more evolvable software defined Internet architecture. [Barath Raghavan, Teemu Koponen, Ali Ghodsi, Martin Casado, Sylvia Ratnasamy, Scott Shenker: Software-Defined Internet Architecture]

Sandeep Gupta discussed rather scary hardware trends, including increasing error rates in memory, and how this may affect networks (potentially increasing loss rates). [Bin Liu, Hsunwei Hsiung, Da Cheng, Ramesh Govindan, Sandeep Gupta: Towards Systematic Roadmaps for Networked Systems]

Raymond Cheng talked about how upcoming capabilities which will be widely deployed in web browsers will enable P2P applications among browsers, so free services can really be free. Imagine databases in browsers, or every browser acting as an onion router. [Raymond Cheng, Will Scott, Arvind Krishnamurthy, Tom Anderson: FreeDOM: a new Baseline for the Web]

Session 2: Security and Privacy

Scott Shenker examined how to build inter-domain routing with secure multi-party computation (SMPC), to preserve the privacy of policies. The idea is that interdomain routing really is a multi-party computation of global routes, and participants want it to be secure. The benefits of using SMPC: autonomy, privacy, simple convergence behavior, and a policy model not tied to the computational model. The last item should be emphasized: there's a lot more potential policy flexibility here, with a much easier deployment story, just changing software at the set of servers running the computation. For example, do other classes of policies have different or better oscillation properties? Part of this (convergence) seems to connect with Consensus Routing. Jeff Mogul mentioned an interesting point: by adding the layer of privacy, it may be very hard to figure out what's going on inside the algorithm and debug why it arrived at a particular result. [Debayan Gupta, Aaron Segal, Gil Segev, Aurojit Panda, Michael Schapira, Joan Feigenbaum, Jennifer Rexford, Scott Shenker: A New Approach to Interdomain Routing Based on Secure Multi-Party Computation]

Katerina Argyraki spoke about how we can change the basic assumption of secure communication: creating a shared secret based not on computational difficulty, but on physical location. The idea is to exploit how wireless interference differs across locations. Security is more robust than you might think, in that you just need a lower bound on how much information Eve misses, rather than knowing exactly which pieces of the message Eve missed. An implementation generated 38 Kbps of secret bits among 8 nodes. However, in a few corner cases Eve learned a substantial amount about the secret. There is some hope to improve this. [Iris Safaka, Christina Fragouli, Katerina Argyraki, Suhas Diggavi: Creating Shared Secrets out of Thin Air]

Saikat Guha linked the problem of data breaches to money and proposed data breach insurance ("Obamacare for data"). In a survey, 77% of users said they would pay, a median of $20. (Saikat thought this may be optimistic.) They're working to develop a browser-based app to monitor user behavior, offer individuals incentives to change to more secure behavior, and see if people actually change. [Saikat Guha, Srikanth Kandula: Act for Affordable Data Care.]

Lunch!

Session 3: Software-Defined Networking

Aaron Gember spoke about designing an architecture for software defined middleboxes, taking the idea of SDN to more complex processing. Distributed state management is one challenge. [Aaron Gember, Prathmesh Prabhu, Zainab Ghadiyali, Aditya Akella: Toward Software-Defined Middlebox Networking]

Monia Ghobadi has rethought end-to-end congestion control in software-defined networks. The work observes that TCP has numerous parameters that operators might want to tune — initial congestion window size, TCP variant, even AIMD parameters, and more — that can have a dramatic effect on performance. But the effects they have depend on current network conditions. The idea of the system they're building, OpenTCP, is to provide an automatic and dynamic network-wide tuning of these parameters to achieve performance goals of the network. This is done in an SDN framework with a central controller that gathers information about the network and makes an educated decision about how end-hosts should react. Experiments show some very nice improvements in flow completion time. Questions: Did you see cases when switching dynamically offered an improvement? And in general, how often do you need to switch to get near the best performance? Some of that remains to be characterized in experiments. [Monia Ghobadi, Soheil Hassas Yeganeh, Yashar Ganjali: Rethinking End-to-End Congestion Control in Software-Defined Networks]

Eric Keller, now at the University of Colorado, spoke about network migration: Moving your virtual enterprise network between cloud providers, or moving within a provider to be able to save power on underutilized servers, for example. Now, doing this while keeping the live network running reliably is not trivial. The solution here involves cloning the network and using tunnels from old to new, and then migrating VMs. But then, you need to update switch state in a consistent way to ensure reliable packet delivery. Some questions: How do you deal with SLAs, how do you deal with networks that span multiple controllers? [Eric Keller, Soudeh Ghorbani, Matthew Caesar, Jennifer Rexford: Live Migration of an Entire Network (and its Hosts)]

Session 4: Performance

Ashish Vulimiri presented our paper on making the Internet faster. The problem: Getting consistent low latency is extremely hard, because it requires eliminating all exceptional conditions. On the other hand, we know how to scale up throughput capacity. We can convert some extra capacity into a way to achieve consistent low latency: execute latency-sensitive operations twice, and use the first answer that finishes. The argument, through a cost-benefit analysis and several experiments, is that this redundancy technique should be used much more pervasively than it is today. For example, speeding up DNS queries by more than 2x is easy. [Ashish Vulimiri, Oliver Michel, P. Brighten Godfrey, Scott Shenker: More is Less: Reducing Latency via Redundancy]
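
The mechanism is easy to sketch: issue the same operation to two independent servers and take whichever answer comes back first. A minimal illustration (my own sketch, not the paper's code), with a simulated lookup standing in for a real DNS query:

    # Minimal illustration of the redundancy idea (my own sketch, not the
    # paper's code): run the same query against two servers in parallel and
    # return whichever response arrives first.
    import time
    from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

    def query_redundantly(operation, servers):
        pool = ThreadPoolExecutor(max_workers=len(servers))
        futures = [pool.submit(operation, server) for server in servers]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        pool.shutdown(wait=False)          # don't block on the slower copy
        return next(iter(done)).result()

    def simulated_lookup(server):
        # Stand-in for a real DNS query, with different latency per resolver.
        time.sleep({"resolver-a": 0.30, "resolver-b": 0.05}[server])
        return f"93.184.216.34 (answered by {server})"

    print(query_redundantly(simulated_lookup, ["resolver-a", "resolver-b"]))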

The questions are getting interesting. Where is Martha Raddatz?

Udi Weinsberg went in the other direction: redundancy elimination. This is an interesting scenario where a kind of content-centric networking may be a big help: in a disaster that cuts off high-throughput communication, a DTN can provide a way for emergency response personnel to learn what response is most effective, through delivery of photos taken by people in the disaster area. But in this scenario, as they have verified using real-world data sets, people tend to take many redundant photos. Since the throughput of the network is limited, smart content-aware redundancy elimination can more quickly get the most informative photos into the hands of emergency personnel. [Udi Weinsberg, Qingxi Li, Nina Taft, Athula Balachandran, Gianluca Iannaccone, Vyas Sekar, Srinivasan Seshan: CARE: Content Aware Redundancy Elimination for Disaster Communications on Damaged Networks]

Onward to Day Two...