Tag: Philosophy

  • The Limits of Analysis Pt. 2

    The Conditions of Analyzability

    In Part 1, I argued that not everything can be turned into data. Some experiences resist detection, some collapse under measurement, and others vanish once we try to replicate them. But when analysis is possible, it depends on certain hidden preconditions, which I’ll call the five conditions of analyzability. We will look at three of them here and the remaining two in the next article. Some, if not all, of these points might come off as truisms at first, but the point of this part of the series is to take a deeper look at the hidden assumptions that lie at the heart of analysis. These conditions serve as thresholds: they are not guarantees of insight, but the minimum requirements for analysis to even begin. Without them, statistics and machine learning alike produce empty formalisms, outputs untethered from reality.

    The first condition we will discuss is detectability. The signal in question must not merely exist; it must exist in a context that leaves a detectable trace for us to observe. If there is no signal to detect, there is no starting point for analysis. A clear example from statistics is hypothesis testing, which depends on being able to distinguish a real effect from random noise. If the effect is completely buried in noise, inference collapses. The same holds in machine learning: a system can only learn about a signal if that signal actually shows up in the distribution of data you are analyzing. Returning to the fraud example from Part 1, fraud certainly exists, but by its nature it tends to escape detection by deliberately avoiding leaving a detectable trace.
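
    To make this concrete, here is a minimal sketch in Python (using NumPy and SciPy, with made-up effect sizes) of how a perfectly real effect can be statistically undetectable when it is buried in noise:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)

    def detect_effect(effect, noise, n=50):
        """Simulate a control and a treatment group, then t-test for a difference."""
        control = rng.normal(loc=0.0, scale=noise, size=n)
        treatment = rng.normal(loc=effect, scale=noise, size=n)  # the effect is real in both runs
        return stats.ttest_ind(control, treatment).pvalue

    # Same true effect, very different signal-to-noise ratios.
    print(detect_effect(effect=0.1, noise=5.0))  # effect drowned in noise: typically a large p-value
    print(detect_effect(effect=0.1, noise=0.1))  # same effect, faint noise: a tiny p-value
    ```

    The effect exists in both runs; only in the second does it leave a trace strong enough for the test to find.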

    Beyond detecting the signal, we need to be able to quantify it. We can think of this condition as measurability: we must be able to structure what we detect into a form that can be compared, ordered, or aggregated, something analogous to the magnitude or strength of the signal. In statistics, this shows up as the choice of measurement scale. Not all numbers mean the same thing; the way something is measured determines what kinds of comparisons are valid. Going back to the customer satisfaction example from Part 1, a question like “on a scale of 1 to 5, how satisfied were you with your visit?” converts a complex feeling into a narrow range of integers. Yet when analysis is performed on that satisfaction data, you’ll often see results like an average of 3.5 out of 5. When you apply quantitative analysis to qualitative measurements, you risk producing results that look precise but aren’t truly quantitative: treating qualitative judgments as if they sit on an interval scale creates an illusion of structure where there may be none. In machine learning, the same condition is mirrored by embeddings, normalizations, and encodings, all attempts to organize messy real-world data into measurable, consistent signals we can learn from. But every such translation is also a reduction: nuance becomes structure, texture becomes number. Whether in statistics or machine learning, measurability always involves a trade-off between richness and regularity, between what is real and what is processable.
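
    As a small illustration (a Python sketch with invented survey responses), consider what we lose when ordinal satisfaction ratings are treated as interval-scaled numbers:

    ```python
    from collections import Counter
    from statistics import mean, median

    # Hypothetical 1-5 satisfaction ratings: mostly delighted visitors, plus a disappointed minority.
    ratings = [5, 5, 5, 5, 1, 1, 1, 2, 5, 5]

    print(mean(ratings))     # 3.5 -- reads as "moderately satisfied"
    print(median(ratings))   # 5   -- tells a different story
    print(Counter(ratings))  # the distribution shows a polarized audience, not a lukewarm one
    ```

    Almost nobody in this hypothetical audience actually felt a “3.5”; the structure we imposed produced a precise-looking number that corresponds to no one’s experience.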

    Next, let’s talk about sufficiency: having enough data to support generalizable results. Even when data is measurable, scarcity limits inference. Our two main frameworks for this philosophical analysis, statistics and machine learning, both illustrate this condition well. In statistics, small samples produce unstable estimates, wide confidence intervals, and high variance. The law of large numbers only works when “large” actually applies. That law is a pivotal result which says, more or less, that the more data you have, the closer your observations get to the “actual” underlying values. A concrete example is flipping a coin: flip it twice and you might get heads both times, but that doesn’t mean the coin lands heads 100% of the time. You need to flip it many more times to become confident about the odds of a specific outcome. In machine learning, sparse data makes models overfit, which is when a model learns relationships that work for the data at hand but don’t generalize to a wider variety of data. Sparse data can trick the model into learning quirks of the specific data set rather than general rules.
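
    Here is a minimal simulation in Python (NumPy, a fair coin as in the example above) of how the observed proportion of heads only settles near the true probability as the number of flips grows:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    for n_flips in (2, 10, 100, 10_000):
        flips = rng.integers(0, 2, size=n_flips)  # 1 = heads, 0 = tails, fair coin
        print(n_flips, flips.mean())              # observed proportion of heads

    # With 2 flips you can easily see 0.0 or 1.0; with 10,000 flips the
    # proportion hovers close to the true value of 0.5.
    ```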

    These first three conditions, detectability, measurability, and sufficiency, set the stage for everything that follows. They define the threshold where analysis becomes possible at all. Without something to detect, a way to measure it, and enough variation to generalize from, analysis risks turning into form without substance. In the next part, I’ll look at the final two conditions of analyzability, learnability and replicability, and explore what happens when even well-structured data begins to fail those tests.

  • The Limits of Analysis Pt 1

    The Preconditions of Knowing

    This is Part 1 of a 3-part series on what makes a quantity analyzable. In this opening piece, I set the stage by exploring the limits of analysis itself: why some things can’t be turned into data at all. Part 2 will unpack the five conditions that make analysis possible, and Part 3 will reflect on what happens when those conditions fail.

    When it comes to machine learning and statistics, the impulse to model, predict, or explain is often very compelling. But before any of that is possible, there is a more fundamental question we need to ask: is the thing we want to analyze even analyzable? Not every phenomenon can be subsumed into the realm of data. Some things resist measurement, others produce too little evidence to generalize from, and still others dissolve entirely when scrutinized for reproducibility. My main claim is that analysis isn’t automatic; it only works under the right conditions.

    It helps to look at some familiar real-world examples that demonstrate this claim. Customer satisfaction is a good one: it is easy to have a conversation about customer satisfaction, but very difficult to measure it directly. A single review does not capture the whole situation, and bias creeps in because people are more likely to leave a review after a negative experience than a positive one. Fraud detection is another example: fraud does exist, but it leaves only subtle traces behind, and if those traces are not detectable, the problem cannot be solved from the data. Happiness is a third example, one that empiricists and positivists have long struggled to account for. It is a deep human experience, but how do we adequately detect, quantify, and replicate it across individuals?

    As we look across both statistics and machine learning, we see a meta-pattern about analysis. For a quantity to be analyzable, it must meet certain conditions. In part 2 of this series, I will look into these conditions in much more detail, but we’ll start with a preview of the conditions for now.

    • The data must first be detectable, meaning it has to have some signal, however faint, for us to find.

    • The data must be measurable, meaning it can be organized into a structured form. At the surface this seems nearly identical to the first condition, but the requirement of structure means we need to be able to do basic things like determine whether one signal exceeds another, or whether there is a hierarchy among the signals we detect, something analogous to the strength of the signal.

    • Next, we need to have sufficient data, enough variation to meaningfully capture a representation of the pattern we wish to analyze. This is one reason why sample sizes in statistics are of such importance when making claims with a specific confidence level.

    • The data must also represent something learnable in the first place, given its structure and the amount you have collected. In other words, the data must be organized so that algorithms or models can generalize from it.

    • Finally, the data has to be replicable. This is another subtle assumption that will take time to analyze in greater detail, but in general, we expect the data to be stable across samples and systems.

    In the next article, I’ll dive into each of these conditions, showing how they shape both the limits and the possibilities of data science. If Part 1 is about asking whether something can be analyzed, Part 2 is about learning how to test those conditions in practice.

  • The Algorithm Made It So Pt.3

    From Lock-In to Liberation: Escaping the Algorithm’s Worldview

    In Part 2, I claimed that optimization is never neutral; it encodes assumptions about what counts as ‘best.’ In Part 3, I want to look at how these assumptions get locked in over time, creating systems that narrow our choices and shape our sense of reality itself.

    Performativity, following the methods I’ve outlined above, effectively “locks in” the model’s worldview. This can lead to a phenomenon called path dependence, which has been studied extensively in fields like political theory and the theory of institutional evolution. Roughly, path dependence says that initial conditions, often contingent on seemingly random circumstances, can shape the later development of a system so strongly that other options become invisible or effectively unavailable, because standardization and coordination problems make deviation prohibitively costly. Even without malicious intent, narrowing what counts as “best” has unintended consequences. An example I particularly like is the QWERTY keyboard layout that most people are familiar with. It was designed with typewriters in mind, but once the layout stuck, it became a norm so ingrained in the culture of tech that it’s hard to imagine a world with actively competing keyboard layouts. Does this mean QWERTY is the “best” or most “optimized” layout? Absolutely not! Simple historical contingencies produced a decision that had a large impact on tech as a whole. When we consider path dependence compounded with performativity, it becomes very easy to suspect that the “best” result is just the product of historical, seemingly random circumstances and the limits of our data and our ability to make useful predictions from it. A telling example might be a disease so rare that we cannot gather enough samples to accurately predict a proper diagnosis.

    The French writer and philosopher Jean Baudrillard wrote about how simulations, images, and videos can become, or at least shape, our perception of reality. Once implemented, these algorithms and models precede and shape the ‘real’ world in just this way. They attempt to predict but end up being prescriptive: the models end up dictating what the new standard is. The optimized version of ‘reality’ is the one most shown or generated by the models, reinforcing the idea that these chosen options are the only ones, or at least the “best” ones, and that you need not think any further. Here we can see shades of the philosopher Guy Debord: the constantly recreated results generated by algorithms and models form a ‘spectacle’, the thing we see and interact with in society. In this view, optimization can also be read as a subtle form of self-governance, structuring choices without overt coercion. Such systems of control, created and then self-enforced, fall right in line with the French “historian of ideas” Michel Foucault. In Discipline and Punish, he traces how human history can be broken down into eras defined by their methods of control, and these algorithms and machine learning models could be the next logical step for the information age. If you want a good credit score, there are specific things you must do; you can’t just do anything to raise it. The simulation becomes the only visible world, structured and enforced through subtle systems of governance.

    To help address the issue of performativity, we have two good examples of ways to stay aware of these blind spots when developing machine learning algorithms. In the predictive policing article from Part 1, the authors effectively solved their problem by adding an element to the model that reduces the weight of arrest incidents in a given region. That makes sense in their case, because it removes the portion of the data that was itself influenced by the model: if we send more officers to a region, we would naturally expect more arrests, simply because there are more officers there to make them. Another model that recently made waves, the Darwin Gödel Machine, reduces the feedback loop by more exhaustively checking the new iterations of models the system creates; because it examines even ‘surprise’ candidate solutions rather than only the currently favored path, it also helps avoid path dependence. Other alternatives include multi-objective optimization, where you balance competing values against each other to produce multiple solutions to consider, or comparing potential outcomes depending on how and why you are optimizing for your chosen metric. Deliberately introducing diversity and noise into a system can also help ensure it doesn’t overfit. If Baudrillard warns us about simulation and Debord about spectacle, then introducing diversity and noise into models is one way to resist that narrowing of reality. Even if you account for all of this, real-world incentives, such as business pressures pushing toward narrow KPIs, will still pull your model toward misalignment.
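
    As one illustrative sketch of what “deliberately introducing diversity” can look like in code (my own toy example, not a technique from either paper mentioned above), here is an epsilon-greedy style ranking loop: most of the time we serve the model’s top pick, but with a small probability we surface something else, so less-favored options never vanish from the data entirely:

    ```python
    import random

    # Hypothetical engagement scores assigned by a model to five pieces of content.
    scores = {"A": 0.9, "B": 0.7, "C": 0.4, "D": 0.3, "E": 0.1}

    def pick_content(scores, epsilon=0.2):
        """Mostly exploit the model's 'best' item, occasionally explore the rest."""
        if random.random() < epsilon:
            return random.choice(list(scores))  # deliberate diversity / noise
        return max(scores, key=scores.get)      # the model's optimized choice

    shown = [pick_content(scores) for _ in range(1000)]
    print({item: shown.count(item) for item in scores})  # every item still gets some exposure
    ```

    The point is not that this tiny trick solves performativity, only that deliberately leaving room for the non-optimal keeps the model’s worldview from becoming the only thing it ever collects data about.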

    The key takeaway I would like to leave you with from this series is that machine learning and algorithmic optimization are performative; they shape the world to match their definition of “best”. This is not a reason to conclude that these models are unhelpful or should not be built. But it is a reminder that creating an unbiased model is a vast, complex, and very difficult undertaking. That difficulty, coupled with the rapid adoption of these algorithms and models in sensitive areas of society, can have grave unforeseen consequences. We must be diligent in ensuring that our models make as few assumptions as possible. This is just another challenge for software developers and data scientists to overcome, and it can be overcome; we just have to be aware of the blind spots in our analysis. Across this series, I’ve tried to show that algorithms don’t just reflect the world; they shape it, lock it in, and govern it. The challenge now is to design systems that widen possibilities rather than close them off. If the algorithms tell us what is “best”, how long before we forget any other ways of being?

  • The Algorithm Made It So Pt. 2: Why ‘Optimized’ Doesn’t Mean ‘Best’

    In Part 1, we saw how algorithms produce the realities they claim to measure. Here in Part 2, I want to dig into what we mean when we say an algorithm is ‘optimized’, and why optimization never means a neutral or universal ‘best’. There is a branch of philosophy known as post-structuralism. It comes out of a long philosophical history, but, as the name suggests, it follows from the structuralists, so let’s look at some central structuralist claims first; the post-structuralist viewpoint will then be easier to understand. Structuralists claim that there is a structure underlying everything: there is a structure to the way languages operate, and, as Claude Lévi-Strauss, the central figure of structuralism, wrote in his books, there is even a structure to how meals are constructed in specific cultures. He analyzed that underlying structure in terms of binary oppositions recurring across contexts, for example cooked/raw and nature/culture; the opposition itself is a structural element. When it comes to natural language processing, machine learning models don’t understand words in the traditional sense. Instead, they analyze the statistical relations between words across a large amount of training data. To a structuralist, this is evidence of a “latent structure” of language. These relationships are reinforced in sentiment analysis, which rests on the same structural logic: words like ‘good’ and ‘bad’ are understood by their position in opposition to one another. It is this underlying structure of oppositions that the structuralist takes as the object of study. Optimization, in this framing, is then “discovering” the most efficient or probable solution that already exists inside the structure.
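
    To make the NLP point concrete, here is a toy sketch in Python (hand-invented co-occurrence counts standing in for a real corpus) of the distributional idea: to the model, a word’s ‘meaning’ is nothing but its pattern of statistical relations to other words:

    ```python
    import numpy as np

    # Toy co-occurrence counts with four context words (tasty, awful, sunny, rainy).
    # The rows are invented numbers, not measurements from a real corpus.
    words = ["good", "great", "bad"]
    vectors = np.array([
        [8, 1, 5, 2],   # "good"
        [9, 0, 4, 3],   # "great"
        [1, 9, 2, 5],   # "bad"
    ], dtype=float)

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    print(cosine(vectors[0], vectors[1]))  # "good" vs "great": high similarity
    print(cosine(vectors[0], vectors[2]))  # "good" vs "bad": much lower
    ```

    The model never knows what ‘good’ means; it only knows that ‘good’ behaves like ‘great’ and unlike ‘bad’, which is exactly the kind of latent structure a structuralist expects to find.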


    In contrast, the post-structuralists deny that optimization is neutral. They claim that every act of “optimization” constructs the very field it claims to reveal. It isn’t finding the optimal path in a stable system; it’s deciding what counts as optimal in the first place. Consider these two viewpoints applied to a modern social media site like TikTok. The structuralist reading would say the algorithm discovers which videos are inherently engaging. The post-structuralist reading would say the algorithm produces the conditions for engagement by privileging specific kinds of content and formats. What counts as engaging, things like comment count or view duration, is a construction of the algorithm, and it drives content to fit that mold so as to appear engaging to the algorithm. Categories like engagement, best, or optimized aren’t natural; they are definitions chosen by developers or institutions. Optimized for whom? Optimized for what?


    From a post-structuralist lens, this means optimization isn’t universal; it only ever exists in a context, and it is goal- and data-dependent. These models are also constructed by specific groups of people with specific purposes. Following that logic, it’s fair to say that the “best” or “optimized” result is the one that meets their chosen metric within the limits of their data. This is not always nefarious; sometimes it’s the result of technical trade-offs made for the model’s performance. But those trade-offs are still decisions, and the perspectives behind why they were made are embedded in the result. We must ask ourselves again: optimized for whom, optimized for what?


    Now let’s take a closer look at exactly how these algorithms’ results go from prediction to prescription and create feedback loops. Going back to our first article on predictive policing, the authors describe systems that use what is known as “batch analysis”: the system trains on data, and then, once newer data has been collected, the model is updated by training on that new data. Systems like this are susceptible to feedback loops, because the newest data you train on has already been influenced by how you redistributed resources after the model’s first results. Your model ends up being trained on data reinforced by its own output. A more familiar example might be social media ranking content that optimizes for clicks, even when that harms the diversity of ideas. Posts or videos that get lots of views get pushed to get even more views, and the feedback loop has already kicked off. These effects push reality toward what the model can measure and reward; the feedback loops actively produce the reality they are supposed to reflect. They don’t just change outcomes; they change what we think is possible. In Part 3, I’ll explore how this narrowing of possibility connects to ideas from Baudrillard, Debord, and Foucault: optimization as simulation, spectacle, and even governance.
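
    Before we get there, here is the loop in miniature: a toy batch-retraining sketch in Python (invented numbers, not a model of any real platform) where two equally appealing posts compete for a click-optimized feed:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Two posts with identical underlying appeal; any early lead is pure chance.
    true_appeal = np.array([0.5, 0.5])
    clicks = np.array([5.0, 5.0])

    for batch in range(20):
        # Rank by past clicks; break the initial tie at random.
        top = rng.integers(2) if clicks[0] == clicks[1] else int(np.argmax(clicks))
        exposure = np.where(np.arange(2) == top, 0.9, 0.1)  # the "best" post gets 90% of impressions
        impressions = (1000 * exposure).astype(int)
        clicks += rng.binomial(impressions, true_appeal)    # next batch trains on data the ranking shaped

    print(clicks / clicks.sum())  # two equally appealing posts, but one now looks overwhelmingly "better"
    ```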

  • The Algorithm Made It So Part 1: Performativity and How Algorithms Shape Reality

    This is the first part in a 3-part series I am writing on the concept of performativity and how it interacts with machine learning and computer algorithms.

    In this first part, we’ll explore how algorithms enact performativity in the world. In Part 2, we’ll unpack how embedded worldviews shape their definition of ‘best,’ and in Part 3, we’ll look at how we might resist or redirect these feedback loops.

    We live in a modern world where progressively more sophisticated algorithms and machine learning systems are used to make real-world decisions in nearly every aspect of life, from the YouTube algorithm deciding which videos are shown and to whom, to the study you can read here, where predictive policing leads to empirical runaway feedback loops. In that study, the researchers look at a common tool used by local law enforcement agencies to predict where to deploy officers, in an attempt to anticipate trends in crime across a city. The system is called ‘PredPol’, and it operates by looking at reported incidents of crime and arrest counts from each neighborhood; it is then updated with new data in batches, an approach called batch analysis. The authors demonstrate how systems like this are susceptible to runaway feedback loops, where police are repeatedly sent to the same neighborhoods regardless of the actual crime rates in the neighborhoods being monitored. One result I found especially interesting: even if all neighborhoods are assumed to have equal (or nearly identical) crime rates, the system will still fall into a runaway feedback loop and send police disproportionately to one neighborhood. The authors found a workaround using statistical modeling to reduce the effect of the feedback loop; in this case, they needed to reduce the weight given to the arrest counts for each neighborhood. Doing so allowed them to successfully predict crime rates for simulated neighborhoods instead of producing runaway feedback loops.
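
    To see the mechanism in miniature (a toy sketch in Python with invented numbers, not the authors’ actual model or their actual correction), imagine two neighborhoods with identical crime rates, patrols allocated according to past recorded incidents, and then the same loop with the patrol-driven records discounted:

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    true_crime_rate = np.array([0.3, 0.3])   # identical underlying crime in both neighborhoods
    officers = 100

    def simulate(correct_for_patrols, rounds=50):
        recorded = np.array([1.0, 1.0])      # a tiny, arbitrary initial history
        for _ in range(rounds):
            hot = int(np.argmax(recorded))                                 # the "hotter" neighborhood...
            patrols = np.where(np.arange(2) == hot, 0.9, 0.1) * officers   # ...gets 90% of officers
            observed = rng.poisson(true_crime_rate * patrols)  # more patrols -> more recorded incidents
            if correct_for_patrols:
                recorded += observed / patrols * (officers / 2)  # discount records by patrol intensity
            else:
                recorded += observed                             # feed raw counts straight back in
        return recorded / recorded.sum()

    print(simulate(correct_for_patrols=False))  # runaway loop: one neighborhood looks far "hotter"
    print(simulate(correct_for_patrols=True))   # discounted updates: allocation stays near 50/50
    ```

    With raw counts fed back in, an arbitrary early lead snowballs; discounting what extra patrols mechanically discover keeps the picture closer to reality. The paper’s real correction is more principled than this toy, but the intuition is the same.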

    This runaway loop is not simply a misjudgment by the algorithm; I would describe it as a modern example of performativity. The concept originates in linguistic philosophy and has spread to many other areas of study. For this article’s purposes, we will take performativity to mean, roughly, that certain actions produce effects or bring about a new state of affairs simply by being performed. We will also examine how the optimization of these algorithms can lead to behaviors that reinforce the algorithm’s model, paying close attention to how “optimization” is treated as synonymous with “best” in these algorithms’ eyes. The “optimal” outcome is the best only according to the goals, constraints, and worldview embedded in the model. This reminds me of an insight from the philosopher Immanuel Kant: our minds bring assumptions to the data, and to what we perceive to be best for the situation. Those innate assumptions and worldviews are built into the models we construct.

    Let’s look at performativity a bit more before we move on. It has its origins in linguistics and was popularized, though not originated, as a concept by J. L. Austin in the 1950s. Austin was looking at statements in society and everyday life that, merely by being uttered, change some state of affairs. An easy example is naming a ship: by announcing “I name this ship the Titanic,” the ship is thereby so named. The concept was taken up by later philosophers in linguistics and critical theory, such as John Searle and Jean-François Lyotard, and Judith Butler’s work on gender identity using performativity gained a lot of traction. Some of the best examples of modern performativity come from economics, where it is often claimed outright that professionals and popularizers affect the phenomena they purport to describe, creating a feedback loop in which results shaped by the model feed back into the model. This article does a great job of outlining the economics example in detail. The authors claim performativity occurs when the act of modeling or predicting alters the behavior of the subjects being measured, aligning them more closely with the model itself.

    This feedback loop, already visible in finance, education, and governance, becomes especially powerful when embedded in AI-driven optimization systems. The authors give many great examples; my personal favorite is retail inventory algorithms. Large retailers use predictive algorithms to stock products based on expected demand, and as stores adjust stock to the algorithm’s output, consumer purchasing patterns adapt to what’s available, reinforcing the algorithm’s original prediction. As a result of performativity, these systems shape the range of actions that are perceived as “possible” or “worthwhile” based on the algorithm’s output. This has real-world effects on people: players of the MMO Old School RuneScape, for example, talk about actively feeling that they are wasting time whenever they aren’t using the training methods deemed “efficient”. It extends to more consequential cases too, like a research group not being selected by an algorithm to receive funding for cancer research. The feedback loop can affect both the seemingly trivial and the critically important.

    Whether it’s a police patrol route, the products on a store shelf, or even the way we train in a video game, these models don’t just reflect the world; they make the world. And as we’ll see next, the way an algorithm defines ‘best’ can quietly set the boundaries for what is even possible.

    If algorithms can reshape the world to match their internal logic, the next question becomes: whose logic is it, and what counts as ‘best’ in that worldview?

    Part 2 Here

  • Why I Started The Quiet Syntax

    There’s a quiet kind of drive that never really stops. It’s not loud or flashy. It doesn’t ask for praise. It just compels you, day after day, to understand more, to refine your thoughts, to learn something new even if you don’t know what for.

    The Quiet Syntax is a home for that drive.

    I started this blog for two reasons. The first is simple: I want to document my personal growth, as a thinker, a programmer, a human being. Progress isn’t always visible in the moment. But looking back over time, written thoughts become a record of change, of sharpening clarity, of new frameworks slowly emerging where only questions used to be.

    The second reason is less about me and more about connection. I’ve spent most of my life feeling out of sync with the conversations around me, as if I’m asking questions others don’t think to ask, or trying to explain things that sound strange out loud. This space is my attempt to share those thoughts anyway. Maybe some of you are wired similarly. Maybe you’ll see yourself here.

    Learning as a Response to the Unknown

    At the heart of this blog is a philosophy that’s hard to explain quickly, but worth unpacking. It centers around the idea of the unknown as affect without object.

    We often treat “not knowing” as a temporary inconvenience, a gap to be closed. But I see the unknown differently. It’s not just the absence of knowledge, it’s a presence, a kind of pressure on the psyche. It moves you before you understand it. You feel its weight before you even name it. In that way, it’s affective, it changes you emotionally, intellectually, but it has no clear target. No object. No outline.

    This affective unknown is what drives me to learn. It’s not curiosity in the traditional sense, it’s more like the need to relieve a tension you don’t fully understand. And sometimes, learning is the only way to convert that pressure into clarity.

    When I encounter something I don’t know, whether it’s a concept in philosophy, a new function in Python, or a theory in linguistics, I don’t just want to master it. I want to fold it into myself, see how it reshapes my thinking. That’s growth, for me. Not checklist mastery. But transformation through contact with the unknown.

    What to Expect

    This blog will probably drift between technical posts, abstract reflections, and whatever hybrid forms emerge when those worlds collide. I’ll be writing about:

    • Programming, data, and systems
    • Language, logic, and semiotics
    • Ideas that don’t quite fit anywhere else

    But more than anything, I’ll be writing to trace the path of becoming, not toward some final goal, but through the syntax of quiet transformation.

    If any of this resonates, I’m glad you’re here.