
How 400 lines of code saved me from madness: Using computers to organize high-attendance events

Note to readers: I can’t really show pictures of the event itself because most participants were underage and I don’t have a release to publish their images. This makes the pictures here look slightly off-topic, but they’re the only ones I can actually show.

About a year ago, I got myself involved as a logistics officer for an event with an attendance in excess of 400 people. For five days, we had to make sure all of them had food, transport, merchandising, a place to sleep, a group to work with, and a good time.

We were constrained in budget, time, and manpower: we wanted to do everything with nothing, and there was way too much data to crunch by hand. Brute-forcing this wasn’t going to work, which is exactly the kind of situation that gets me excited for some outside-of-the-box thinking.

We used computers, and they worked beautifully: this is how we kept track of the lives of 428 people for five days without losing our minds.

Dreams, experiences, and realizations

Carlos Dittborn, one of the people responsible for hosting the FIFA World Cup in Chile in 1962, had a quote that has resonated in this country for more than sixty years:

Because we have nothing, we want to do everything.

Carlos Dittborn, in an interview during Chile’s bid to host the 1962 FIFA World Cup.

If you’re anything like me, this quote lays bare the ethos of makers: it’s not about anything other than solving the problem; the thrill of making something work is enough, and that drives us to do whatever it takes.

This project was born out of my work as a consultant and youth leader in a religious organization: we had groups scattered all across the country, and they all knew of each other, but there hadn’t been an occasion for all of us to come together as one. The last time something like that had been attempted was back in 2015, and most people currently in the ranks of our youth groups had been too young to attend.

Overall, that first event was a success, but it highlighted a key reason why using machines matters: there was just way too much information to keep track of with nothing but minds and pieces of paper. It’s a remarkable achievement that it worked anyway, but it took around 40 people a couple of months of tremendous effort to get all the data crunched, and many mistakes were made during the event: things got lost, food got delayed, and people didn’t know where to be. Keeping tabs on 400-ish people is a task beyond what the human mind can manage on its own, so better solutions were needed.

There’s also the age variable: these people are mostly kids. While they’re old enough not to need permanent supervision, adolescents are not exactly known for following the rules or caring much about what the adults have to say, so the people in charge needed to be freed from the burden of information that could be better managed by computers.

In practical terms, we needed:

  • Receive all registrations to the event in quasi-real-time, so we could set up sleeping accommodations for all participants and their leaders, along with food and health requirements, transportation, etc.
  • Assign every participant to a group in which all the activities were to be carried out. These groups also had requirements regarding gender, age, and place of origin.
  • Keep track of who ate at every meal, down to individual people, both to ensure correct payment for the food and to ensure no kid skipped more than a single meal.
  • Transport everyone to a different location on the second day, which meant getting them all onto the correct buses both heading to and leaving from the new location.
  • For some activities, have the kids rank their preferences from a number of choices, tally the answers, assign the events, and hand out the attendance lists to participants and organizers alike.
  • Hand out supplies individually to each participant, keeping track of who had what.

This is all pretty par for the course for any event, but there was another variable for us to wrangle: this all had to be done at a breakneck pace, by only five people, and with a budget of around 500 USD.

A need for speed (for cheap)

If it had been maybe a couple dozen people, all of this could easily have been done with a copy of Microsoft Office and some macros. The main problem here was speed: groups had to be assigned as soon as the registration data was ready, all 400 people had to be checked in at every meal in less than an hour, and loading people onto the buses had to be done in under 90 minutes. It was an ambitious plan, but doing it any other way would have meant taking time away from the things that actually mattered. Another problem was concurrency: for the math to work out, we needed multiple people filling in data in our tables in real time while maintaining operator independence. For some applications, we also needed real-time monitoring that gave us insight into what exactly was going on, along with methods for ingesting data from many different sources.

You kinda-sorta can do this with a cloud solution like Google Sheets, but complex operations like assigning groups against set criteria are something a simple spreadsheet just can’t do comfortably: a cell is a cell, and operating with data arrays as results is out of reach. I do know about Excel macros, but they have always felt like a hack to me. An actual programming language with a database is far more flexible.

On the other hand, how do you make the user faster? Even asking for something like a name took way too much time, and memorizing a number would bring us to a standstill the second someone forgot theirs.

A solution materializes

Not everything in life is having fun and playing the guitar.

The backbone of all of this had to be a data storage and manipulation solution: a way to store structured data and run queries that also satisfied our speed and concurrency requirements. In other words, we needed a database.

Fortunately, we already had a MySQL server on an internet-facing machine that we could take advantage of. Database engines are like an Excel spreadsheet on steroids: they can process truly mind-boggling amounts of information, run queries, automatically calculate data, answer multiple queries at once... you name it, they can do it. Unlike a spreadsheet, though, accessing a database requires the Structured Query Language (SQL), which meant that if non-engineers were going to use this, a frontend was needed.
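As an illustration of the kind of question a database answers almost trivially, here is a sketch under stated assumptions: a hypothetical meal_log table with one row per scanned meal, and a participants table keyed by each person’s four-digit code (the real schema isn’t published here). Finding every kid who skipped more than one meal becomes a single query:

```python
# Sketch only: host, credentials, and the meal_log/participants schema
# are illustrative stand-ins, not the real ones.
import mysql.connector

conn = mysql.connector.connect(
    host="db.example.org", user="events", password="secret", database="camp"
)
cur = conn.cursor()

TOTAL_MEALS = 12  # hypothetical: total meals served across the five days

# Count each participant's logged meals and flag anyone who missed
# more than one of them.
cur.execute(
    """
    SELECT p.code, p.name, %s - COUNT(m.id) AS missed
    FROM participants p
    LEFT JOIN meal_log m ON m.participant_code = p.code
    GROUP BY p.code, p.name
    HAVING missed > 1
    """,
    (TOTAL_MEALS,),
)
for code, name, missed in cur.fetchall():
    print(f"{code} {name}: missed {missed} meals")
```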

For this, I need to come clean: I’m not a very good programmer. Sure, I can build things that work, but they aren’t going to be pretty. I picked Python as my language of choice: interacting with MySQL servers from it is very well documented, and data processing can be done fairly quickly if you’re not sloppy about where your data goes. For a while I toyed with making a graphical interface using Tkinter and then curses, but since I was only looking to accept keyboard input and display basic data, a simple terminal would do: inputs and prints were the way to go. This saved me from wasting hours debugging an interface and gave me much more speed when iterating and making modifications.

A sample interface using print and input. The entire interface was reprinted with each cycle, but this was acceptable for our needs.

To get high throughput where counting people was required, I turned to the retail industry: I used Code-39 barcodes, each one encoding a four-digit number assigned to a participant. Barcode readers are cheap, most of them support Code-39, and making it work with my software was dead easy: when you plug a barcode reader into your computer, it is detected as a keyboard. All scanned characters are written as if someone had typed them, followed by an automatic press of the Enter key. This had two key advantages: a simple input field could read barcodes pretty much as fast as we could scan them, and if a code became damaged, it could be typed in manually.
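To make the scanner-as-keyboard idea concrete, here is a minimal sketch of what one of those check-in frontends could look like. Everything here is assumed for illustration (the meal_log table, the connection details); the real scripts are in the repo linked below.

```python
# Minimal meal check-in loop. Because the scanner "types" the digits and
# presses Enter, a bare input() reads a scan and a manual entry alike.
import os
import mysql.connector

conn = mysql.connector.connect(
    host="db.example.org", user="events", password="secret", database="camp"
)
cur = conn.cursor()
served = 0

while True:
    # The whole "interface": reprint everything on every cycle.
    os.system("cls" if os.name == "nt" else "clear")
    print(f"=== DINNER CHECK-IN ===  served so far: {served}")
    code = input("Scan badge (or type the 4-digit code, empty to quit): ").strip()

    if code == "":
        break
    if not (code.isdigit() and len(code) == 4):
        input("Invalid code, press Enter to retry...")
        continue

    # Record the meal against this participant's code.
    cur.execute(
        "INSERT INTO meal_log (participant_code, meal) VALUES (%s, %s)",
        (code, "dinner"),
    )
    conn.commit()
    served += 1
```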

A sample label for participants. The number next to their name is encoded in the barcode; the G and B fields correspond to their group and bus, respectively.

All the rest of the data processing was done with Python scripts that read the database and calculated everything we needed. There were two types of scripts. The first were frontends for real-time operations, such as:

  • Keeping track of meals.
  • Giving out t-shirts and notebooks.
  • Check-in for bus transportation.

The second type were one-time scripts that calculated new data from what we already had:

  • Assigning work groups for each participant, along with their leaders. These groups had age, gender, and origin requirements.
  • Assigning recreational activities and lectures for all participants based upon a list of preferences.

This gave us a new challenge, however: we had never arrived at explicit criteria for making groups like these before, so how could we make a machine think like a human?

A machine thinking like a person

Everyone who has had to organize an event of pretty much any kind has almost surely faced this conundrum: how do you split a large group into smaller groups?

For small groups, rules of thumb and educated guesses are enough, but beyond maybe 70-80 people, it starts to get incredibly tedious, and mistakes are pretty much bound to happen. Computers, however, are deterministic machines, so they can handle a classification problem like this one, provided we can explain all the criteria to them. So, how do we do it?

Workgroups

Let’s look at our first example: we need to divide all participants into groups of 10 people. Before we get into the weeds with programming, we need to ask the fundamental question of this problem: What makes our groups better? Let’s see what we know:

  • All groups are to perform activities under the supervision of two adult leaders. These tasks are centered around getting to know each other, sharing opinions and experiences, and finding common ground among the different groups.
  • All participants are teenagers aged 14 to 18 years old, and they’re usually split into age groups within their places of origin, so mixing them up would be preferable.
  • Participants are of mixed gender, with a slight majority of women.
  • Participants are very likely to know people from their own place of origin, and likely to know people from nearby places, as they often have joint activities. A key aspect of this event is for them to meet new people, so we need to spread them out as much as possible.

With this in mind, we can arrive at some criteria:

  • Groups will have 10 or 11 people, a necessity in the very likely scenario that the number of participants is not evenly divisible by 10.
  • Groups have to preserve the overall gender ratio as much as possible: if the whole is 40% men and 60% women, then each group of 10 should have 6 women and 4 men.
  • Places of origin should be split as evenly as possible across all groups, to avoid concentrations of people who know each other. People from the same local area should also be split up if possible.
  • Ages within each group have to be as mixed as possible.

Great, now we have our criteria. How can we turn this into code?

The approach I went with involves sorting the table of participants and then assigning a group to each one in rotating fashion. I will call this method sort-and-box, as those are the two key steps involved. Conceptually, there is a box for each group, and each participant is assigned a box sequentially: the first person goes to group 1, the second to group 2, and so on. Once we’re out of boxes, we roll back over to the first one, and we stop once everyone has been assigned a group. If we have sorted the table correctly, this guarantees maximum dispersion among participants, but it creates a new challenge: how do we sort the list of participants?

A conceptual view of the sort-and-box method.

This approach has a single core tenet: if you put people together in the table, they will end up in different groups; there is no way for consecutive participants in the table to end up in the same group. So, to get maximum dispersion, we need to sort all the people who need to be kept apart, together. We can also sort by more than one criterion through recursion: we sort inside the already-sorted blocks. This creates a tree of sorted participants, where the first sort has first priority when splitting, then the second, and so forth. An additional layer of separation can be achieved by handling the larger groups (by place of origin) first, which ensures the large groups (the hardest ones to split) are split as evenly as possible, without interference from smaller ones.

With this, a Python script was created that performed the following steps (a condensed sketch follows the list):

  1. Get the list of participants from the SQL database.
  2. Calculate a multiplier for each place of origin. This value is the product of two numbers: the number of people from that place of origin divided by the total number from its local area, and the number of people in that local area divided by the total number of people in attendance. If we sort by this number, the bigger groups and zones get sorted first. These values are appended to each participant.
  3. Sort the table by origin multiplier, gender, and age, in that order.
  4. Create empty groups by dividing the number of participants by 10 and discarding decimals.
  5. Assign each person a group using the round-robin method described above.
  6. Add a column to the table with each participant’s group.
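A condensed sketch of those six steps, with hypothetical field names (origin, area, gender, age) standing in for the real schema:

```python
# Sort-and-box sketch: sort so that people who must be separated sit
# together in the table, then deal them out round-robin.
from collections import Counter

def assign_groups(participants, group_size=10):
    total = len(participants)
    origin_count = Counter(p["origin"] for p in participants)
    area_count = Counter(p["area"] for p in participants)

    # Step 2: (origin's share of its area) x (area's share of the total),
    # so bigger origins and zones sort first and get split most evenly.
    def multiplier(p):
        return (origin_count[p["origin"]] / area_count[p["area"]]) * (
            area_count[p["area"]] / total
        )

    # Step 3: multiplier first (descending), then gender, then age;
    # a tuple key gives the nested "sort within sorted blocks" effect.
    participants.sort(key=lambda p: (-multiplier(p), p["gender"], p["age"]))

    # Step 4: number of boxes, discarding decimals.
    n_groups = max(total // group_size, 1)

    # Steps 5-6: round-robin assignment, stored with each participant.
    for i, p in enumerate(participants):
        p["group"] = i % n_groups + 1
    return participants
```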

At last, our work groups were complete. Sorting all 400-ish participants took around three seconds, and I’m sure you could make it faster if you wanted. This meant more time to receive registration data and more time to print out the barcode labels we needed.

Activities

On two occasions, the groups would split up: once among different lectures, and once among recreational activities. This was intended both as an opportunity for participants to share experiences within the event and as a way to create spaces where people could meet outside of their groups. In both cases, these were the requirements:

  • Each activity had a limited maximum number of participants.
  • Participants had to decide which activity they wanted to attend, but the last call was made by us.
  • Participants who got their preferences in first would get priority in the assignment.

As we had been using Google Forms for most of this, participants were asked to rank all the possible choices from first option down to last. That data was then imported into a SQL table and processed as follows (a sketch of this pass follows the list):

  • Get the maximum number of participants for each activity.
  • Analyze each participant separately, with the earliest ones to respond going first.
  • For each participant, check their first option. If there are seats remaining, assign them that activity.
  • If there are no seats available, move on to their next option until one is found.
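In code, that pass is a short greedy loop. A sketch, assuming each response carries a timestamp and a ranked list of choices (the field names here are made up):

```python
# Greedy activity assignment: earliest respondents get first pick.
def assign_activities(responses, capacity):
    # capacity: {activity_name: max_seats}
    seats_left = dict(capacity)
    assignment = {}

    # Process responses in the order they came in.
    for resp in sorted(responses, key=lambda r: r["timestamp"]):
        # Walk the ranked choices until one still has seats.
        for activity in resp["choices"]:
            if seats_left[activity] > 0:
                seats_left[activity] -= 1
                assignment[resp["participant_code"]] = activity
                break
    return assignment
```

Because total seats exceeded the number of participants, the inner loop always finds a seat before the ranked choices run out.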

Because there were more seats than participants, everyone was assigned a place, even if it was their last choice. Having a program do this meant the lists could be collated mere minutes after the poll closed, leaving more time to sort out the activities and more time for participants to get their choices in.

An excerpt of the table given to participants (with their names hidden) for checking into their activities. This was made entirely automatically.

Overall, this is an exercise in teaching ourselves about our own process of thought, so we can teach some sand how to do it for us. Best of all, sand can usually do it faster and more consistently than a human can.

Want to know more? Check out my GitHub repo! It has all scripts I used for this.

Duct tape and cardboard

Prussian general Helmuth von Moltke the Elder, one of the key figures in the unification of Germany, once said:

No plan survives first contact with the enemy.

Helmuth von Moltke the Elder, Kriegsgeschichtliche Einzelschriften (1880)

There is some nuance to this quote: there are obvious advantages to planning ahead, but all plans will soon face unforeseen challenges. The real world is too complex; there are too many variables at play for all of them to be accounted for, so you have to be flexible if your objectives are to be met.

This project was certainly no exception. For one, all of these scripts and data analysis tools were very vulnerable to the garbage-in, garbage-out problem: if registration data had errors or gaps, everything started breaking real fast. Because this data ultimately came from humans, we needed to make sure the user could not make a mistake even if they wanted to. Each place of origin received a Google Sheets spreadsheet in which they had to type all of the information regarding their participants, so how could we idiot-proof it?

In comes data validation: spreadsheets have become leviathans of features, and one of them is the ability to inspect data as it is typed, so that nothing that fails specific rules can be inserted. First, all cells that were not to be edited were locked, preventing the user from breaking things like calculated fields or sheet concatenation. Then, every input field was assigned a type: dates had to actually be valid and have a specific format (DD-MM-YYYY), phone numbers had to have a specific length and area code (using regular expressions to match characters, something Excel can’t do but Google Sheets can), emails had to have valid syntax, and so on. Also, once you filled out a single field in a row, you had to fill out all the others, or you got an error.
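Belt and suspenders: the same rules can be re-checked in code right before ingestion. A sketch of what that could look like (the regexes and column names are illustrative, not the exact rules we enforced):

```python
# Re-validate rows before they touch the database: never trust a single
# validation layer when the data comes from humans.
import re
from datetime import datetime

PHONE_RE = re.compile(r"^\+56\d{9}$")                  # hypothetical format
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")   # rough syntax check

def validate_row(row):
    errors = []
    try:
        datetime.strptime(row["birth_date"], "%d-%m-%Y")  # DD-MM-YYYY
    except ValueError:
        errors.append("bad date")
    if not PHONE_RE.match(row["phone"]):
        errors.append("bad phone")
    if not EMAIL_RE.match(row["email"]):
        errors.append("bad email")
    # Partially filled rows are rejected, mirroring the spreadsheet rule.
    if any(not value for value in row.values()):
        errors.append("missing field")
    return errors
```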

Once all the sheets were received, a big table was assembled from all the data, which of course had to be skimmed by a human before ingestion: you make something idiot-proof, and the world makes better idiots. Fortunately, most errors had been caught in time and only minor corrections had to be made.

Then came the barcodes. Our initial plan was to make wristbands and print the codes on them: we even had test runs made to check that our readers could pick up the codes correctly. However, a week before the event, the printing press informed us that they would not be able to fulfill our order in time. This not only meant we needed a different way to get the barcodes to the participants; it also meant we had to design our own labels, since the press was going to handle that in-house.

We quickly solved it using the notebooks we were already giving out: a simple paper label on the rear cover had us covered, no problem. But how could we make 400 personalized labels in just a couple of days?

The answer is very simple: Microsoft Word. As it turns out, it has an incredibly powerful label maker, which can take a table of raw data and churn out labels to your heart’s content. It can even generate barcodes on the fly, which was very handy on this occasion. In about two hours we had all the labels designed and printed, and by the afternoon all of them had been placed inside the notebooks. It was tight, finishing the day before the event, but it saved us from scrapping the entire control system.
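For the curious: if memory serves, modern versions of Word expose this through mail-merge field codes, something along the lines of { MERGEBARCODE ParticipantCode CODE39 } (the field name is made up here), which renders a Code-39 barcode straight from the merged data source, with no barcode fonts or pre-rendered images required.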

The day comes

I was wearing many hats during this whole ordeal, live sound included.

Our first task turned out to be excellent for fixing late bugs and smoothing out errors: each place of origin arrived separately, and each participant was given a t-shirt and the aforementioned notebook. Each barcode was scanned to keep track of who had received what. Because the places of origin arrived at different times, the pace was relatively calm and we could ingest data manually if need be. Some bugs were found and quickly patched in anticipation of the big event: the meals.

The problem we had with meals was one of speed: our calculations showed that, to feed everyone in the allocated time, we would have to push out a meal every 13 seconds. Our catering crew was good, but if we slowed down even a tiny bit, we could jeopardize our ability to serve everyone in time. Failure was out of the question. Our excitement was palpable, then, when the queue started backing up not because we couldn’t keep up, but because the serving line was at capacity and had backed up to where we were scanning the barcodes. Even with a single scanner we could keep up with the serving rate, and two were enough for pretty much any situation.

Assigning activities was also a huge success: everything from composing the registration form to distributing the results to participants was done in a couple of hours, with only minor tweaks needed each time, which the massive time savings now made possible. Overall, participants were very satisfied with the distribution of activities, and our policy of transparency regarding the selection rules meant we got almost no complaints or last-minute rearrangements.

Our biggest hurdle speed-wise was the boarding of the buses: 10 buses of 45 people each had to be completely filled and emptied in both directions, with no more than 60 minutes for each operation. The real challenge was getting people to move fast enough to their buses, but after a few slower runs we found our rhythm, and the return journey was even completed ahead of schedule, with all buses boarded in under 50 minutes.

Even after the event itself, our control scheme kept being useful: attendance numbers helped us review which activities were the most popular, and our meal counts were within 5% of what the catering crew actually served, which convinced both us and them that we were paying the right amount of money for over 2000 meals served.

Lessons learned

This entire project was a massive success from beginning to end. While most tech-savvy people will agree that this is not a particularly complicated project, introducing even a modest amount of computational power to an event that requires this much information processing generates enormous time savings, which has one key consequence: we, the people making the event work, had more time to attend to the kind of issues you cannot account for beforehand, instead of trying to wrangle an overgrown spreadsheet.

Another advantage is consistency: when the rules of the game are clear, computers do a better job at making decisions than we do, and if we make a conscious effort to eliminate bias from our algorithms, we can create fairer solutions that maximize the chances of consistently good outcomes. Being transparent about what your code does also lends legitimacy to unknown software: if you can explain what your code does in layman’s terms, chances are people will trust it and follow the instructions it gives. Be cautious, however: computers are bound to the biases and prejudices of whoever programs them, so do not put blind trust in them.

Even when problems came up (uncaught bugs, missing functionality, and a need to adapt to changes in the event program), our software was so simple that a couple of lines of code were usually enough to make it do what we wanted. We even re-ran the assignment scripts multiple times when bugs were caught, and everything was so fast that we could redo it all with minimal time lost.

Conclusions

Perhaps the most astute observation from all of this is one of systems architecture: machines have to be designed to serve you, not the other way around. If you create an automaton that plays to its strengths, offloading mind-numbing work from people, it makes them in turn more useful, because you’re taking advantage of their humanity. More time for us meant we could plan everything out better in advance, and the trust we placed in the machine meant we had more time to think about what we were doing and why: we wanted to give these kids an unforgettable experience and the chance to grow as people, together, and that is something that machines can’t do.

What we do with systems and machines and automation, we do because it gives us our humanity back from the void of data. Adopting technology is an imperative not just because it’s fun or useful, but because it gives us the tools we need to comprehend and interact with an increasingly complex world.

I sometimes feel that technology is on trial in the public consciousness right now: the endless march of innovation often makes us jaded and skeptical of adopting these tools, and for good reason. Like all tools, they are values-neutral; it’s up to us to decide how and why we use them.

There are also many reasons to be distrustful: we’re getting a better understanding of how social media negatively affects our relationships and self-image, we’ve seen what tech companies are capable of doing for a quick buck, and after a pandemic that had us staring at screens for sixteen hours a day, it’s understandable that we might wish to escape these machines for good.

But these machines also give us many unprecedented abilities: to communicate at high speed to and from anywhere in the world, to build models that help us understand the world around us, to create new solutions, to save lives, to serve as tools of freedom and dignity, to preserve our past, to shape our future, and to let us focus on being human.

What we created here is none of those things, but it allowed us to create an event that I’m sure will be a high point for everyone who attended. We had the tools at our disposal, and we used them effectively to create an amazing experience for everyone, without losing our sanity in the process, and that feels great. It fills you not only with pride and joy, but with a tremendous sense of accomplishment and purpose.

So please, if you can, use them to build systems that decrease the amount of suck in this world. Craft new experiences, push the boundaries of what is possible, be amazed by their sheer power, and maybe you will create something amazing in the process.