Sunday, 14 January 2024

Baconian Strategies, or the Possibility of Originality in AI Art (MidJourney 2)

 

As I discussed in the previous part, a key philosophical question for the future of AI image generation and art is the question of representation. Will future iterations of MidJourney, DALL-E, Stable Diffusion, etc, just become ever more proficient at translating complex prompts into accurate visual representations, with a consequent loss of the wildness and "ghost in the machine" type interventions, which for me are its main selling point? Or will the technology retain, or even begin to coalesce around its own artistic tendencies, ending up less of a tool and more of a genuine collaborator?

One artist who appreciated more than most the impact of new technology on artistic practice was Francis Bacon. In interviews he used the expression "deepening the game" to describe the approach painters needed to take in response to the advent of photography and film, which had during his lifetime thoroughly challenged the role of painting in the visual arts. After all, if photography was now the dominant medium for representing reality, what could traditional painting, and especially figurative painting offer that went beyond mere illustration? Bacon's answer to this challenge was to develop a style and technique of painting that not only produced works of unsettling power, but which also assimilated photography into their mode of production. He made no secret of the fact that photographs - of his friends and lovers, from newspapers and magazines, and even pictures of old masters like Velazquez' portrait of Pope Innocent X - frequently formed the raw material for his own images. While other artists of the time abandoned the figure and moved into abstraction, Bacon took on photography and in a sense made the camera into a collaborator.

Artificial intelligence poses yet another challenge to artistic production, on perhaps an even greater scale than photography. It not only takes aim at the skills of the digital artist or graphic designer, but puts into question the very notion of artistic authorship. As such, I've spent a fair amount of my CPU time on MidJourney attempting to apply some "Baconian strategies" to how I use it and the types of images I've tried to make. Some of this has consisted of pictures produced in Bacon's style with a contemporary twist, while at other times I've attempted to develop a more general approach to image generation in which Bacon's concepts and insights on art are applied.

I should say from the outset that each version of MidJourney has a different take on Bacon's style which reflects the differing levels of complexity in the coding and the way the developers have trained it. Generally speaking, versions 3 and 4 will give you more abstraction, dissonance and incoherence, whereas the different iterations of version 5 and the recently released version 6 will offer more realism, better figurative representation and more straightforward expressions of compositional coherence. As such, producing Bacon-esque images is not as straightforward as asking for so and so "in the style of Francis Bacon". Indeed, as you will see, Bacon's paintings themselves must often be brought into the prompts to provide guidance and set examples for the AI to follow, alongside moving between versions and using counterintuitive phrases to excite more interesting results. Take for example the two images below produced by version 6, both of which use the prompt "Francis Bacon painting of a dog sitting peacefully in front of a large open window", with the addition that the one on the right also includes a jpg file of Bacon's Dog (1952) as part of the prompt.

Neither of these images really captures much of Bacon's style. The most one could say is that some of the "brushstrokes" for the surrounding rooms are vaguely reminiscent of how Bacon would situate his figures, perhaps in the 1940s and 50s. I also like the way the flooring from Bacon's painting has been remixed into the picture on the right. But the figure of the dog itself is just too well rendered, too illustrative, and just, well, a bit dull. Switching down to V4, however, gives us a very different result. The picture on the left uses the exact same prompt (including the image of Bacon's Dog) as the picture above on the right, but gives us an entirely different style which to my mind is much more characteristic of Bacon's compositions and stark framing of the figure. The dog is perhaps still too much of a direct representation but the whole thing is more obviously modernist in execution. The translation of the window, however, has entirely lost its realism, and is now only hinted at with the arrangement of line and colour in the upper half of the picture.

Finally the image on the right adds a longer description in near-natural language: "Francis Bacon painting of a dog sitting peacefully in front of a large open window, twisted and distorted anatomy. Mirrored reflections of 1940s interior room in the background. Geometric lines frame the main figure. Yellow, black and white colour palette with streaks of red and blue". Now we're getting somewhere. The dog has been messed up a good deal with odd proportions and the surreal inclusion of a lizard-like tail. The requested colours are present, though the background and interesting use of forced perspective lines is perhaps a little busy. But overall I'd say this is a fairly good attempt at a Bacon-like dog, although the extra data in the description - such as the mirrored reflections and 1940s interior - has not directly translated into the image.

This example demonstrates one of the key attractions of MidJourney, which is its penchant for chance and indeterminacy when interpreting the data fed into it. The most convincing of the prompts above includes a natural language element, with subject-predicate and statement parts, plus an actual image approximating the subject and style desired. Nevertheless, the output image is something quite different, neither exactly what we asked for, nor a complete departure. Getting to that point took me six prompts, each of which produced a grid of four images, many of which were nowhere near the desired outcome. It's this element of chance and accident that brings MidJourney into true Bacon territory. Don't believe me? Take it from the man himself. The following is an exchange between Bacon and the art critic David Sylvester in an interview in 1966.

FB: I want a very ordered image, but I want it to come about by chance.

DS: But you're sufficiently puritanical not to want to make the chance come too easily.

FB: I would like things to come easily, but you can't order chance. Because if you could, you would only be imposing another type of illustration.

DS: Are you aware of the moment when you find you are becoming free and the thing is taking you over?

FB: Well, very often the involuntary marks are much more deeply suggestive than others, and those are the moments when you feel that anything can happen.

Those of you conversant with the history of the surrealist movement will recognise the ethos at work here. Chance encounters, the unconscious, or the marvellous accident, were all themes and techniques developed in various ways by the surrealists (and before them in Dada) as a means to get behind - or beyond - the usual modes of representation in Western art. Bacon's courting of chance and accident in his work in part continues this tradition, but it also formed a key element of his personality, which came out in his often tempestuous relationships and predilection for gambling. But as we can see in the quote above he also wants to remain in overall control and desires "a very ordered image", not an abstraction produced by chance encounters between paint and canvas. Similar tensions are observable in his private life, which despite the reputation for sexual and alcoholic excess was similarly marked by a tendency towards control. I argued in an essay in 2018 that rather than being a simple facet of his character, this constant tension between chance and order constitutes Bacon's Form-of-Life; the wellspring if you like, from out of which came his remarkable paintings.  

Making a picture in MidJourney without an obvious subject can lead in odd directions. One way I like to start is by using the blend function in which up to five images can be fed into the AI which will then attempt to produce a single image which melds the aesthetic qualities of those in the prompt. It does this without any additional word prompting, instead relying entirely on its own internal logic. How exactly it does this, which elements it focuses on and which it ignores, is more than a little mysterious, and again like so many things with MidJourney, the results tend to vary between versions. One thing I've noticed consistently is that it will latch on to any anatomical shapes it can find and centre the image around them. If there is no central figure in the images fed into it, it will sometimes create one by anthropomorphising structures or textural elements. This can lead to a lot of strange results where limbs, faces or composite body parts will appear. This is usually the result of feeding highly non-comparable images into the prompt; but can, if you choose them wisely, lead to a fairly unique starting point to build your composition and improvise around results using additional word prompts or substituting new images into the prompt to take it in different directions. 

The image to the left started out as a blend which included some older Bacon-type images produced by MidJourney, plus some more photorealistic figurative data and "textural" matter. I call input images textural if their composition strongly influences the overall aesthetic style of the output without defining the subject or dictating overall coherence. Such images are often abstract or contain strong colours. After I had the initial blended image I began to push it more deliberately using word prompts in addition to the original images. First I started with "A young woman sits in a gloomy hotel room", before adding a description of her clothing and extra aesthetic filler such as "soft pastel thermal colour palette, variable contrast" and "dark urban skyline, shadowy figures, architectural drawings". Eventually I fed one of the resulting images back into the blend alongside material that suggested more of an urban setting. One of the images in the resulting grid had altered the position of the figure such that it looked reminiscent of a pole dancer. With this "involuntary mark" made I changed the word prompt to push it further towards such an outcome, before finally switching a textural image into the prompt, which I knew would have the effect of scrambling the anatomy of the figure (and losing its head!). The result was the image you see here: a unique, unplanned composition that, while utilising the indeterminacy of the AI's training, is also ordered, thematic and recognisably figurative.

Bacon also employed a degree of improvisation in his work, gambling - as he saw it - a painting's fate on the next dramatic brush stroke. If it went wrong and could not be recovered he would frequently destroy the painting. But this approach could also yield desirable (though unconsciously produced) results, which for Bacon were a route away from illustration and towards what he described as a more violent return to reality itself. The most famous example of Bacon using this method is his Painting 1946, sometimes known as the butcher shop for its depiction of sides of beef. In an unusually detailed account of how one of his paintings was created he describes the intention of depicting a chimpanzee in grass, before then attempting to paint a bird of prey landing in a field. None of this came to fruition and in his laboured attempts he had produced such a strange mass of marks that it seemed as if the figurations he finally arrived at - the final assemblage of which owes a good deal to Poussin's The Adoration of the Golden Calf -  had formed by an act of his unconscious.

Francis Bacon, Painting 1946

What Bacon describes here - and what I've tried to apply with my image of the pole dancer - is I think one of the best strategies for getting unique images out of MidJourney. Images that do not privilege prompt following or exact representational translation but nevertheless aim at an ordered final result. It's also not random, since in order to get good results the user has to learn the aesthetic and interpretative tendencies of the AI and how to provoke it into making images with the desired artistic qualities and level of compositional coherence. It's also rarely instantaneous and involves a lot of tweaking and remixing of prompts. Sometimes, it leads nowhere worthwhile, and one is always pushing up against MidJourney's more conservative impulses, which are usually kitsch and highly influenced by American popular culture. When this happens, there is no choice but to abandon the thread and start off once again in a completely new direction.

In practice this involves, so to speak, mastering the dark side of MidJourney; deploying dissonance and contradiction, and getting a feel for the kinds of prompts that send the AI away from its basic representational paradigm. For example getting anything vaguely explicit into your compositions isn't easy. MidJourney's interface on Discord is moderated to exclude explicit images or words from prompts. This extends to slang or even phrases that could count as a double entendre. Despite the moderation, which also varies depending on the version you're using (earlier ones are generally more permissive for words, but also can't tell the difference between the Venus De Milo and a pornographic image, go figure) the system will frequently create images with explicit content, usually around the naked human form. In short, MidJourney has a predilection for boobs, and they sometimes pop up when you least expect them. It's just another little morsel of indeterminacy that makes using it fun. And on the plus side you're able to continue Western art's long tradition of fascination with the female form.

 
Testing the limits of the moderation, both around text and images is worth doing as it can be frustrating to have your prompts knocked back for seemingly innocuous infractions, e.g. the phrase "tight white" in any context cannot be used. This being said some surprising things do get through, so roll the dice, you never know what will come out the other end.

The moderation is meant to extend to horror and gore but there seems to be some intelligence to the way it is applied so that photorealistic images of injury or warped anatomy are excluded, but more painterly or cartoonish prompts get through. This is certainly beneficial if like me you want to produce figures with distortions or that generally look a bit roughed up. Combine this with the relative ease in making if not explicit then at least vaguely erotic images, then you have a good set of levers for adding more adult content into your compositions, and this is no bad thing. As I described in the last piece, the majority of content produced on MidJourney would in my opinion fall into the category of infantile. Thus I'm all in favour of pushing it towards a more mature audience. To my mind using cutting edge artificial intelligence to raise the aesthetic standard of erotica is a more venerable project than using it to produce endless Pokémon fan art.

I've grouped the images I've created using these strategies into two broad categories: After-Bacons, which try to produce images in Bacon's style but with subjects more in keeping with contemporary life in 2024; and Meta-Bacons (yeah I know it's a lame tag) which apply the conceptual toolkit described above to make images that do not look like paintings by Francis Bacon but nevertheless owe something to his ethos and approach to image-making.

One example of the former I'm fond of is this picture of a man with a newspaper standing in a modern glass box apartment. Some of the elements, such as his white trainers and the fact he's holding a newspaper - which I recognise is hardly symbolic of the digital first era - were directly stated in the prompt, as was the high setting over the city; but the rest of the composition came about through trial and error, and improvising around marvellous accidents thrown up by MidJourney itself. I especially like the mirroring of the buildings outside the window on the presumably hyper-shiny floor of the flat. The figure is a delightful quandary. He looks anxious and out of place, despite the seeming attempt to look relaxed with his newspaper, which he holds as if he were trying to hide something in his hands. The clothing is also amusingly misplaced. Is that shorts over cut off leggings? The trainers are nicely done and draw attention to the shadow on the floor which, instead of tracking the figure's silhouette, forms an incongruous rectangle. This is a scene that nicely captures the ennui and out-of-jointness amid superficial luxury that is symbolic of 2020s Western culture.

And as an example of a Meta-Bacon the image on the right, which I've titled An Unexpected Visitor, has all the qualities I'm looking for. The prompt developed over a couple of days and multiple iterations during which different colour palettes were tried out and the arrangement of the figure and background played around with. I'd been experimenting with trying to add something of Man Ray's surrealistic black and white photography to the mix and wanted a slightly erotic theme around a woman putting on stockings in a bedroom. The trouble was that most of the versions either produced a too photorealistic representation or the result was just too abstract and unnatural looking. One of the images included in the final prompt featured a classical statue without its head - which I'd included to give a nudge towards the body shape I wanted - but instead this seems to have filtered down into the final picture in the wonderful accident seen above, where the woman's head is unnaturally large and constructed as a kind of glowing neon sign. The head, turned to the figure's left, wears an anxious expression, and the left hand is held up at the throat as if startled by a sudden knock at the door. The dishevelled state of the bed mirrors the degraded framing of the image, as if it were a collage of old photographs.

The figure itself is beautifully formed, non-symmetrical in appearance (which is something MidJourney often struggles with) and has these suggestive marks that could be either surface damage (in a photo collage) or symbolic of violence. Finally, there's the odd sprite-like entity at the head of the bed, that in its diaphanous appearance seems hardly there. It's an image that's full of intrigue and tension, combining a number of styles into a striking final result. As with the picture of the man in the apartment above, it grew out of a combination of intention and accident, where unintended results could be built upon and improvised around. I had only an outline of the subject I wanted, and the decision when to settle for a final image is somewhat arbitrary, after all, images can be endlessly remixed and new versions produced. In a similar fashion to how Bacon claimed his paintings were "let out", pictures on MidJourney are not so much finished as abandoned.

The jargon around AI such as ChatGPT and MidJourney would have the user known as a 'Prompt Engineer'. One influencer on Twitter even speculated this could be a future job title for anyone whose livelihood was under threat from the new technology. Sadly I think for many that will be the case, much as in previous rounds of automation, which rendered older skills obsolete and pushed former hands-on builders and creatives further away from the product of their labour, so that they now mostly supervise the machines that actually do the work. The threat now is that the inner logic and tendencies of the AI - which, we should remember, is never neutral, and can only approximate our world, with all its prejudices and darkness included - will come to supplant the inexhaustible creative potential of human beings. Humans do require paying; MidJourney just needs a monthly subscription. Thus you can be sure that company executives across the creative industries are gleefully adding up the possible savings from eliminating those same pesky humans from the production process. If the public gallery on MidJourney is anything to go by, then MidJourney is well suited to being a workhorse for the mainstream of the culture industry.

But, as I hope I've been able to show, this technology also has artistic potential that can be brought out in collaboration with a user that seeks - so to speak - to meet the ghost in the machine halfway. I think this form of use - which rejects the standard representational paradigm, whereby AI should be a slave to our most asinine dreams - is a real and open possibility. It's certainly no more outlandish than the idea of artificial general intelligence or the notion of the human/AI singularity that excites transhumanists. Art, like cruelty, is one of the quintessential human things, and the leap from the Lascaux cave paintings to Duchamp's Fountain is perhaps greater than what artificial intelligence will do for us. 

My bet is that Francis Bacon would also not have been overly disturbed by AI image generation. There is no possibility of perfect prompt following since the meaning of the words pumped into it can have no objectively translatable aesthetic counterpart. To suggest otherwise is a category error. It's the same category error made by people who claim music can be reduced to maths. Adding image prompts into the mix only compounds this basic truth. Consequently, there will always be a fundamental degree of chance and indeterminacy at work; a ghost in the machine, that by way of certain clairvoyant techniques can become the collaborator in an artist's work. If he were around now perhaps Bacon would have used MidJourney much as he used photography; as a source of inspiration, of subjects and new ways to show the figure. The technology will undoubtedly accelerate cultural production in the most shallow and commercially driven parts of society. But alongside that certainty there is also the possibility of taking up the challenge and deepening the game.

 

 


Monday, 1 January 2024

Concepts of Post-Liberal Politics: Activist Bureaucracy?

 I've seen this term popping up with increasing regularity on Twitter, mainly associated with conservative discourse around the influence of "woke" ideas on State institutions. It appears to be a mostly Anglophone phenomenon, following in the wake of culture wars around gender, so-called critical race theory in schools and universities, and issues around immigration and crime. It initially struck me as something of a misnomer since the two terms, activist and bureaucracy, usually denote two distinct relationships to institutions and two distinct positions within Liberal democratic politics. 

 


The Tweets above in part continue a longstanding tradition on the Right of associating State bureaucracy with both the Left generally and with anti-democratic tendencies within the Liberal State itself. Perhaps the classic statement of this position was made by arch free-marketeer and Ordoliberal, Ludwig von Mises in his Bureaucracy (1944), where this father of Neoliberalism blamed government suppression of the profit motive through regulation (and New Deal style welfarism) for the rise of bureaucracy in the US. The association with political activism, however, adds a contemporary spin, perhaps emanating from recent US social conflicts, such as the Black Lives Matter protests in 2020. The association of activism and established state bureaucracy might also reflect the dominance of Post-War US political culture more generally, in which the Left has traditionally been most effective in the form of activist movements rather than the organised party type structures which have had greater impact in the UK and Europe.

The idea of activist bureaucracy also signals a number of currents in Western political culture around the legitimacy of institutions and fears around an ideological or anti-democratic 'capture' of State functions which are meant to be nominally neutral. In the UK it's now commonplace to hear of universities, the health service, or other such public bodies being "captured" by activists. What this claim seems to imply is that the public body in question has ceased to function as an expression of the general will or common good and instead espouses an ideological (i.e. particularistic) position, with consequent policies negatively affecting those who do not subscribe to the ideology in question. This view is strongly associated with criticisms around Cancel Culture and the influence of diversity, inclusion and equality initiatives within those same institutions.

Despite the above now being a fairly well understood current in contemporary Western politics there is still something intuitively jarring about the articulation of activism and bureaucratic modes of organisation, especially in the context of struggles around gender and race, which given their emotional, and - in the case of gender - highly speculative nature, would not seem to lend themselves to the form of rational, rules based managerial techniques associated with bureaucracy. Then again, if the Party/State bureaucracies of the USSR or contemporary China are expressions of those States' interpretations of Marxist/Leninist thought (something the Right never ceases to deride as the epitome of ideologically driven politics), then the necessary association of rational management with neutrality would appear less secure.

But perhaps this is the central point and the reason the notion of an activist bureaucracy garners so much online rancour from garden variety conservatives and right leaning Liberals hankering after the good old days of free speech and "rigorous debate". In short, the appearance of an ideologically driven, evangelical (I actually think this better expresses the phenomenon than the term activist) bureaucratic State apparatus within the framework of liberal democracy puts into question long standing assumptions about the rational, scientific, and value neutral quality of modern State administration. It reveals to the based dissident or red pilled conservative that there is really no such thing as neutrality, and the collapse of the centre ground, rather than eclipsing universalist one-nation government, actually reveals that government as always-already partial, partisan and particularistic in nature. This is the reasoning at the heart of so much chatter about the deep State and the limits of electoral democracy to really influence the course of events.

Marxists have long understood that what the ruling class calls common sense is only the ideologically driven worldview of the dominant property owning classes. Free market Liberal economic doctrine had this character to it, at least until the financial crash of the late 2000s, and is still pushed by elements within mainstream political parties, despite the last fifteen years of near flat growth as a result of the casino going broke. Imputing an 'ideological' character to the beliefs of one's political enemies is a commonplace strategy as much on the Right as it is on the Left. We always wish to believe our side has a better grasp of reality, whereas our opponents only deal in pipe dreams and mystification. In doing so we immediately devalue the political aims and methods of the other side, so much so that any attempt to find common ground would be futile. Thus the common sense / ideology opposition is a real political antagonism that cannot be resolved within the deliberative mechanisms of the Liberal State.

Consequently this adds a sharp edge to the accusations around activist (read ideologically driven, partisan) bureaucracy. Since what is at stake is a battle over different visions of reality, there can be no debate or political resolution which might find a mutually acceptable compromise. There simply isn't enough common ground. We see this absolute polarisation at work in all the major social struggles of the day, whether it's trans activists claiming critics of "gender affirmative healthcare" are supporting genocide; or opponents of mass immigration accusing lawyers and human rights activists of engineering the replacement of white Europeans. The lack of a leading political culture to suppress these extremes while mediating between different value claims and forms-of-life is one of the reasons all social conflict now looks like a life or death struggle.

One thing all sides have in common, which rises to the surface in fears about activist bureaucracy - and to a lesser extent in Leftist paranoia about institutional bias and hidden Fascist agendas - is the latent Western myth of a neutral and benign civil power which somehow rises above the multitude of value claims and competing forms-of-life to resolve, like a beautifully engineered machine of governance, all those conflicts that threaten to dissolve the commonwealth. Everyone wants the administrative state to rule according to principles of sound rational governance, for its authority to be generally recognised, and its decisions to be accepted as Just. But this is a fantasy, because at a time of crisis above all what we want is for the administration to rule in our favour, and if it doesn't then it simply can't be rational or neutral or Just.

We live at a time when all authority is being put into question by the pace of technological and sociological change, its foundations creaking and crumbling, its officials impotent against forces they cannot control and seem not to understand. Human scale authority, in which individuals recognise themselves and their communities, and which is accepted as legitimate by a majority of the governed, no longer exists in the West. In its place has appeared the figure of the digital swarm and of global civil war, a conflict where the belligerents traverse national boundaries, can consist of a single lone wolf or an army of tens of thousands, and where weapons range from words on social media, to legal suits, to HR policies, to Kamikaze drones and ballistic missiles.

All the old certainties and boundaries are dissolved; the myths by which we formerly governed ourselves no longer hold sway. One thing is sure, that no-one anymore cares much for democracy. Whether you're a gender critical campaigner trying to get "gender ideology" out of schools, or a racial justice activist wanting compulsory unconscious bias training in all government departments, your aim is not to win at the ballot box or put forward a program for public scrutiny. Your aim is to bend the vast invasive power of modern State administration to implement your will without recourse to the messy and less certain process of consensus building. Everyone wants their Caesar in HR. We are uncontrollable societies of disaffected individuals, as the late Bernard Stiegler put it. It is only in this context that the seeming misnomer of activist bureaucracy can be understood.

Friday, 29 December 2023

Half way along the road we have to go ...(MidJourney)

 

Since April I've been using an image generating artificial intelligence program called MidJourney. Previously my experience of the cutting edge of AI had been putting a few basic requests into ChatGPT, none of which I found particularly inspiring. I'd discovered MidJourney via a Twitter post from Mary Harrington who was commenting on the "nefarious" atmosphere of some of the purportedly more photorealistic images produced by the AI. The images in question were of groups of attractive young women, seemingly hanging out at a party. The composition was naturalistic, like a candid shot taken with an old Polaroid. Only when I looked closer did the figures begin to lose their air of authenticity. The limbs were not quite in the right orientation, hands were slightly too small or large, and most obviously at least half had missing or extra fingers. There was also something about the expressions on the faces of these digitally created nubiles that was, well, a little off, like the expressions on waxworks but far more fleshy and uncanny.

Harrington was right, there was something nefarious about these pictures, like the photographer had captured a scene never meant to be exposed to public view. Anyway, it was enough to pique my interest and have me sign up for a few minutes' CPU time on the MidJourney server, which is accessed via the Discord app (I have no prior experience using Discord, which is apparently mainly used by gamers). I soon came to realise that quirks of anatomy are a common feature, particularly of the earlier versions of MidJourney - which as of this month is up to version 6.

Like all of these image generating AIs, it responds to prompts fed in by the user, which can be either text, images or a combination of the two. There are also a variety of commands that can be used to tailor the images and to select basic parameters such as aspect ratio. Once a command is submitted the MidJourney bot - which you are in communication with on Discord - usually responds within 30 seconds or so, presenting you with a grid of four images, the differences between which will depend on how you have configured the prompt. For example, increasing the '--chaos' level will add greater stylistic variation, while the '--stylize' parameter sets the level at which the bot will improvise around your prompt. There are also levers for 'Weird', which I've yet to find a use for, and 'Image weight', which allows you to tweak the degree to which either the image or the text part of a prompt determines the output. There are a lot of other little features but these are the ones I tend to use as standard.

So what's the point of this thing? What are people doing with it? Well, generalising somewhat, I'd say there were four main areas or genres of use based on the public gallery on the MidJourney site and images which have garnered popularity on social media:


1 - Fake news or comedy images of celebrities, politicians and other public figures. This is where you got those images of Donald Trump seemingly being dragged by police to jail. Now, this stuff has only really taken off on MidJourney since they've released versions that can actually reproduce a good likeness of the person being punked. Version four was passable but required a lot of coaxing by adding real photos of the individual to the prompt, but things really took off with version 5 and especially 5.2 which can produce excellent likenesses of major political figures without the need to feed it a picture of the person. Version 6 (I've only used this for a few days) appears to be just as good, but with better overall complexity and compositional coherence. It's the potential for misuse - or really any use - of this feature that gets a lot of people worried about the possibility of AI undermining democracy or rendering all news images untrustworthy - as if this wasn't already well on the way.


2 - A picture of your dream girl, often suspiciously young looking, and either anime style Asian or Teutonic blonde babe in appearance. Yes really, I'm not glossing this; it's clear this is what's going on, and why should we be surprised that in the world of OnlyFans and chatbot girlfriends people immediately gravitate towards using AI in this way, as if the developers at MidJourney had a mainline into the deepest desires of the average extremely online male. But what really bothers me about the trend is how formulaic all the images are. It's all waify (or should that be Waifu?) skinny pixie girls or vivacious models of ambiguous age framed in soft light, staring out vacantly at the sex-starved prompter on the other side of the screen. None of these kinds of images are pornographic (more on that side of things later), if anything they're just boring in their cute Eurasian / Aryan predictability. I have no doubt that this is going to be a major growth area in AI development as it so neatly tracks the general state of social and sexual atomisation in the developed world. One could imagine going to your doctor with depression born of loneliness only to be prescribed a fully customisable AI companion. It's cheaper than therapy!


3 - Mainstream entertainment spin off type imagery / fan art. This needs little explanation, and it's perhaps understandable that at a stage in which the people most likely to be using this technology are geeks and the extremely online, a significant proportion of the content will be Star Wars, Marvel, Pixar, Pokémon adjacent type images. So what? It's boring but it's entertainment, and that is what most people will be satisfied with.

 

4 - Decoration. I'm revealing my Baconian prejudices here, but for me perhaps the most uninspired use of this vastly powerful technology is in the production of purely decorative images, whether they be for marketing purposes, website design, or just stuff people can use at home. It's dull. End of. But it's also possibly the most "everyday" application image generating AI will be used for. This is where we fear for the graphic designers who face an 'adapt or perish' situation when this stuff goes mainstream. In fact it's already going mainstream with Adobe Firefly which will combine generative AI with Adobe's already wildly successful Photoshop and Illustrator products. So, I don't have a problem with using AI for basic graphic design, but it is for me - as the workaday application - thoroughly pedestrian and only likely to track the mediocrity of contemporary culture.

So what have I used this technology for? What has held my attention over the last nine months and encouraged me to pump about £200 into a monthly subscription and extra CPU time? After the initial novelty wore off, and I realised I couldn't just upload a picture of myself and expect it to produce a perfect likeness in whatever Kabuki drag scenario I chose, I began to explore what you could call its inner aesthetic tendencies. My attitude was straightforward; I wasn't interested in a digitally enhanced version of reality. What interested me was pushing the AI into making images of a world that could not, and perhaps should not exist.


This is where I think the developers of MidJourney show their conservatism and adherence to the basic coordinates of traditional Western thought about aesthetics, since they constantly talk - in their Discord announcements - about 'prompt following' and 'coherence'. What this amounts to is fixing the horizons of the AI to the field of representation. The images the AI is meant to produce should visually represent as closely as possible the verbal and visual data fed into it in the prompt. If you ask for a black cat on a white mat, then that's what it should give you each time, except this is in aesthetic terms a very formal and empty description, which in and of itself could never be said to constitute the salient datum for a unique work of art. Either the same cat on the same mat would be produced each time, which would satisfy very few people, or an infinite number of radically different cats and mats would result, which would show MidJourney to be an engine of arbitrariness and rob the prompt engineer of their status as an "AI artist".  

Now, the developers know this, which is why they've put so much effort into MidJourney's ability to understand natural language prompts and into training the AI to interpolate those prompts with its dataset (or supercluster as they call it) in an increasingly coherent and complex way. The results, as you can see with the comparisons between V4 and V6, are startling.

Left V4 Right V6 - Prompt: Videodrome, surrealism, cybernetics + the same 4 prompt images

Even so, I wonder what the ultimate end goal is for MidJourney's programmers when it comes to the question of representation? Do they want a tool that if given an essay on the aesthetics of Van Gogh's Sunflowers could perfectly reproduce the Dutch master's painting (an idea which makes certain assumptions about art criticism), or do they want an AI that is itself an artist? If it is the former then they would seem doomed to produce endless iterations that track the user's prompts to an ever greater degree of representational accuracy, which - since everyone will have a different idea of what that cat on the mat should look like - will never become the reliable tool of representation they might hope for, and will likely never satisfy those users - like myself - who hope AI can offer something uniquely nonhuman to the artistic process.

I don't know if that's what their engineers want; perhaps some, the more commercially minded of their staff, do. After all, MidJourney is usually promoted as a tool for artists, not as an artist in its own right. If this is their intention then what they'll sacrifice, and indeed some versions have already bordered on this, is the wildness and surprising interpretations MidJourney gives to the prompts, and in particular how it interpolates radically incoherent combinations of text and image.


This has been my entry point to MidJourney, the USP without which I doubt I would have stuck around. You could call it dialectical, though that's probably a bit high-flown. The attraction is the interplay between intention and randomness, and the feeling that there's something truly uncanny (a ghost in the machine if you like) behind the coding, that is working with you when you use it. This is also what makes it potentially addictive. You craft your prompt, maybe it takes only a minute, maybe it takes ten minutes, or twenty, before you fire it off into the heart of the rough beast. Then there is the delay while the magic happens; compare it to the gap between a dice throw and winning or losing. There is the slow incremental appearance of the grid, and finally the finished result which you scrutinise for hitherto unknown aesthetic qualities - also count the fingers, if there are meant to be fingers, and especially if there are not. It is genuinely addictive and I think even more so when you're trying to produce weird and unconventional images, since half of the fun is trying to sneak something past the moderators in the hope that it'll push the result into more interesting territory.

And yes there is moderation, lots of it. Generally speaking you can't pump explicit images into it, or words that are likely to generate explicit images. That is not to say that it doesn't generate explicit images anyway, it really does, and with very little help; I do wonder whether the moderation is aimed more at controlling the inner tendencies of the AI than of the users! I just can't imagine the supercluster training set and coding is able to be thoroughly purged of the raw material that goes into making up a pornographic image. If it knows what Botticelli's Venus looks like then it knows how to generate nude female figures. To exclude all images of nudity would cut it off from a large swath of the canon of Western art. This is another reason why I find no use for the --weird command. Just adding extra dissonance to the prompt, in either text or image form, is enough to throw the result out to odd places and seemingly activate its inner tendencies and interpretive habits.


For example, browsing other users' pictures I found that the term (Nether Regions) in brackets, which is a kind of soft English slang for genitals (from the idiom of hidden Hellish things), would push any image with organic content into looking, well, not like actual nether regions, but just more gnarly, as if the biological components were suffering from some awful disease. An odd discovery, and only one of many similar counter-intuitive examples of non-representational prompting. It doesn't give you what it says on the tin but it does something interesting and fairly consistently.

It's perhaps saying too much that there is a poetic quality to MidJourney, but it is striking the way in which it interpolates the different parts of a prompt, and can turn grammatically or semantically jarring words into coherent compositions. There is then an internal logic perhaps beyond the human ability to make sense from disparate linguistic and visual data. This is perhaps its greatest strength as an artistic tool and it'll be interesting to see whether the developers are curious enough or sufficiently philosophically minded not to smother MidJourney's wildness in a race for ever greater representational accuracy.

Tuesday, 28 November 2023

London is a Dead Museum: Pt1. Boroughs


I was born in Camberwell in the London borough of Lambeth, high up on the 8th floor of the Ruskin wing of King’s College Hospital. Upon being lowered to earth I was taken across the nearby border to a house in the borough of Southwark. Later I was to work several short stints at that hospital, labelling samples, ferrying intravenous fluids up and down the Escher-like backstairs of that Victorian structure. In my mid-teens I spent a week clearing out and cataloguing old medical records from a dusty hovel of a room in the basement, my arms rash red from the ancient detritus and mites left undisturbed for decades. Turned out that dank oubliette was paradigmatic of the NHS’s attitude to what we now call data integrity. Years later I trespassed upon a quasi-ruin of a hospital in the process of being demolished in West London to discover wicker baskets full of records secreted in long forgotten places. Somewhere along the line I developed a healthy dislike for hospital environments – as many people do – and have been thankful for remaining largely free of their mixture of bureaucracy and dysfunction, interspersed with what devotees of the art call “Healthcare”.

I attended schools in Southwark and the neighbouring borough of Lewisham. These were my three psychogeographical Graces, the tripartite alma mater of my formative years. I would not live or work outside of their borders until I was 22 years old. This latter fact would count against me as the tide of “anywheres” swept over the Capital during the 2000s. People who had come into the city for work or study would look disparagingly upon someone born here and who chose to stay, as many of my school and university friends had not. But if London was such a great place to be, full of opportunity and “buzz”, then how could they begrudge me for having been here all along? This attitude prevailed most strongly amongst the home-counties parvenus, those small town high achievers who took to the city as if they had won a special prize, bringing with them their market town kitsch and insularity. These people were the principal beneficiaries – though not the architects – of successive waves of gentrification which claimed once affordable working class areas of the city, from Dalston and Stratford, to Brixton and Clapham.

One of the most visible consequences of this interior migration has been the replacement of pleasingly down-to-earth and integrated communities – each with their own character – with a network of twee BoBo enclaves, all extolling a variation on the same monoculture of vacuous coffee house chic, faux street food and boutique urban living. Oh! insipid parochial bumpkination, oh Bermondsey and Borough Market, your stalls once laden with affordable produce now overflow with cupcakes and artisan vegan scotch eggs! Oh Electric Avenue! whose pioneering illuminations now light the way to “Brixton Village”; the countryside as a military Green Zone, inside the city walls, evicting the traders and welcoming well-heeled beardy craft brewers and premium Ramen bars.

I attended the opening of Brixton Brewery’s expanded production facility in late 2018. The only non-white faces were in the steel drum band. Pints of Pale Ale held by men in gilets with kid skin gloves. Brixton’s long and chequered history distilled (or fermented if you like) into easily marketable brand names; Coldharbour, Atlantic, Electric; streets once rocked by the riotous voices of the unheard are now as unaffordable and neutered as any of London’s most fashionable boroughs. The average monthly rent for a single room is over £1000, and the rent inflation has driven out many of the traders and small business owners that used to line those streets whose names the brewery has appropriated. It’s not an isolated phenomenon. Across the capital the same process of gentrification has taken place, turning once characterful districts into expressions of the same shallow white middle-class vision of urban life. It’s not that the black and brown faces have disappeared, far from it in fact, but what we now have is a stage-managed version of a multi-ethnic city where expressions of diversity and difference are only acceptable as a live-your-best-life narcissistic consumer product (or career path) for the white middle-classes, many of whom have come to the city late and approach its history and varied communities like a continental breakfast buffet, rather than a genuinely integrated form-of-life. Liberal Progressivism sits happily amid all of this, since the activism it promotes is skin deep and its administrators are made up predominantly of the same laptop class who have an interest in maintaining the status-quo.

One especially acute example of this I experienced last year was in Walthamstow, another once affordable edgeland now being given the hipster craft-beer Bobo urban cleansing treatment. On the long weekend of the late Queen’s Platinum jubilee a friend and I visited the craft beer breweries and their associated taprooms which now sit on the East side of the Maynard and Lockwood reservoirs, forming what is colloquially known as the Blackhorse Beer Mile. Nothing unusual or distinct was seen at first, just the usual mix of “high-class bar snacks” and highly hopped brews in converted railway arches and industrial units. Only later in the evening did we notice the bass that was rocking Exale brewery’s “favela chic” taproom was emanating from a huge dub sound system set up around the corner, outside a local Caribbean eatery and community centre. In a flash of 20 yards we went from white middle-class progressive monoculture of craft beer, gender neutral toilets, and indifferent clientele to an all black street party of goat curry, Dancehall and Red Stripe. We lingered on the fringes before a woman came across and invited us to join in. Clearly she found the unofficial segregation taking place as uncomfortable as we did. So, we stayed for a couple of beers and some bone shaking bass.

Later I asked a young woman working behind the bar of the Exale brewery what she thought of the spontaneous division. She hadn’t thought about it at all, had not noticed. She was of student age, perhaps attending some London college or university, which no doubt makes a big deal of progressive politics and diversity, and yet she experienced no feeling of being out-of-joint at such blatant racial divisions. Was this perhaps because such divisions were common in whatever small Shires town she was from, or perhaps she just spontaneously preferred white urban monoculture? Whatever the truth, I had never seen the shallowness of modern London’s so-called multi-culturalism so clearly displayed.

In Lewisham, my secondary school – Forest Hill School for boys – was a hugely diverse place and had a high proportion of kids of Afro-Caribbean heritage. This resulted in a sort of pidgin of Jamaican and London slang becoming the lingua franca of the playground. This way of talking is so common now that it’s become almost synonymous with London youth culture as a whole. But in the early 90s it was fairly new and quite scandalous for white working-class parents whose sons would come home talking like a cockney Horace Andy. It was also independent of hip-hop culture which didn’t take over until a few years later. The point is we grew up together, we blended and took on bits of each other’s backgrounds in a context of open, mutual and non-judgemental exchange. When in the mid-2000s I moved to West London, I started finding myself in pubs and other public places that were almost entirely white. Then as now I found it disconcerting.

As time went by I found myself in such spaces more frequently and in areas with large long-standing minority communities. Lambeth and Southwark, for instance, saw huge amounts of gentrification in the first decade of this century. By 2010 I often found myself in newly regenerated pubs and bars that I now recognise as catering for that same white urban monoculture. What was jarring was that this was happening against a backdrop of historically high levels of migration into Britain, especially London. So, although on a purely numerical basis London was becoming less white, less European, more and more I found cultural segregation and an increasingly boring, white middle class expressing itself through parochialism and a small-town mentality.

Paul Kingsnorth made similar points about gentrification and standardisation in his 2009 book Real England; “the same chains in every high street; the same bricks in every new housing estate; the same signs on every road; the same menu in every pub”. He wasn’t wrong, and the “blandification” of life has only continued apace, now amplified by a burgeoning digital-first culture that seeks to manage your preferences before you even know you have them. But what I think was difficult to foresee in the late 2000s was how a bland monolithic urban culture could rise that doesn’t appear like the usual chain shop takeover of the high street or corporate shopping mall experience. Instead, it appropriates the language of cultural and urban diversity while sanding down all the sharp edges, especially those which resist big finance and the myriad forces driving proletarianization. What you get then is a high street where kitsch sits alongside poverty, where market town twee cohabits with the gig economy, cashless craft beer bars with Poundland. The same powerful corporate forces are present, but now they sit under the surface of a superficially affluent and diverse urban environment which maximises consumer choice and “experience” to the detriment of everything else. As I have written elsewhere, the slogan for this new world, born from out of an unholy alliance between crisis response and the DEI industry, is the now unavoidable “Be Kind”. And yet at the same time society becomes more isolated and spiritually atomised. Nihilistic, anti-social behaviour across all ages and backgrounds is now commonplace.

Gentrification, infantilisation, disneyfication, gigification, museumification, civilisational decline; there are many terms that try to capture this phenomenon. All express something of the truth of London’s story over the past three decades but perhaps none capture how it feels to experience these transformations first hand. How a home can become an alien and unliveable place, and why so many people who were born here have chosen to leave, including myself.

Saturday, 18 November 2023

All Aboard the Arc

 


I recently investigated the website of the Alliance for Responsible Citizenship (ARC), which held a conference in London in October with a list of ‘dissident’ speakers – of both Left and Right persuasion – that included Jordan B. Peterson, the historian Niall Ferguson, former Prime Minister of Australia John Howard and current Tory MP Miriam Cates. Framing itself more as an odd-ball think-tank than a party or lobbying organisation, it includes on its website a collection of questions which readers are invited to respond to and send to the organisation. ARC, as the acronym suggests, is about renewal, presumably against a backdrop of cataclysm or eschatological fervour. But they also quote Martin Luther King, Jr's bon mot “the arc of the moral universe is long, but it bends toward justice”; which evokes the longue durée of historical progressivism. In short, like so many of these Right leaning conventicles they are a confused mishmash of influences and styles. Nevertheless, most of the questions here assume concepts and forms of thought drawn from garden-variety (or perhaps late 19th century variety) Liberal democratic and free-market orthodoxy. More on account of boredom than serious engagement, I set about in full doomer mode to respond to their questions.

 

Can we find a unifying story that will guide us as we make our way forward?

A story that unifies a nation or community can only come from out of that community's specific historical circumstances. As such, it could never serve as a universalist mythologeme to "humanity" or other such abstraction. Local stories and myths have been the foundation of every historical community until Christianity assimilated them or swept them away, only for its own stories to be superseded by the universalism of progress and global capitalism, neither of which preserve the specificity or spirit that grounds a living people. Only the overthrow of globalist universalism can open the way to new myths and new foundations. We need a multitude of myths not a monocultural mythology.

 

How do we facilitate the development of a responsible and educated citizenry?

One can never be responsible to an abstraction, so, in order to foster a responsible citizenry we must first nurture communities that bind their members through an autochthonous loyalty, that is, a loyalty born from out of the everyday social relations specific to a historical community. No one feels responsible in a mass society governed by empty formal values such as those proffered by Liberalism. The hyper-individualism of our present day is incompatible with a demand for substantive social obligations beyond those demanded by the law. If the law could be remade to instantiate a leading culture of substantive values, and those values embedded in institutions, then perhaps we could talk about specific responsibilities. As things stand it is perhaps true that the renewal of responsibility may only come through the cooperation required for preservation of self and kin in the event of a general collapse. Thus, pampered Western complacency must end.

 

What is the proper role for the family, the community, and the nation in creating the conditions for prosperity?

These things have been rendered defunct by contemporary financial and technological global capitalism. Family, community and nation are all social formations that set limits by including some and excluding others. Such delimiting is opposed to the liquifying effects of global Capital which seeks to break down barriers to its self-expansion, rendering all persons fungible within a world-market for goods, services and cultural ephemera (delivered in a digital-first economy). Now, the old-world formations of family and nation can only be defensive, a bolt-hole to wait out the collapse of the global so that a new and modest idea of prosperity might arise. As Hannah Arendt once said, "I do not love the world, I love my friends". This is a statement of limits and a good start.

 

How do we govern our corporate, social and political organizations so that we promote free exchange and abundance while protecting ourselves against the ever-present danger of cronyism and corruption?

Cronyism and corruption are a natural outgrowth of any organisation once it becomes disembedded from its local role and turns global. Most institutions, whether political, corporate or social have become totally marketised over the previous few decades such that they all work with a similar set of concepts that have nothing to do with any autochthonous community or identifiable set of values beyond profit, loss and market share. It is foolish to imagine any other values could arise while the present set of social relations prevail.

 

How do we provide the energy and other resources upon which all economies depend in a manner that is inexpensive, reliable, safe and efficient, including in the developing world?

There is no technological solution to the problems of energy and resources. The Club of Rome report of 1972, which identified limits to growth, is still the most clear-sighted and straightforward statement of a basic fact. As long as human beings conceive of themselves as separate from their environment, and as long as they harbour the desire to obtain the God-like power to transcend environmental limitations, we will continue on this path to destruction. It is perfectly possible that human beings are marked above all else by hubris and that collective self-destruction is our natural fate. But small pockets of human life may survive and be able to start again. Build back from out of ruins.

 

How should we take the responsibility of environmental stewardship seriously?

It is difficult to take it seriously since it ascribes to us a collective power that is only an illusion. We did not collectively reason our way into climate collapse and we cannot collectively reason our way out of it. Industrialisation and the mass exploitation of the earth's resources was a process that took centuries and was never planned out on a global scale. The claim that global humanity can manage its way out of its own fate is nothing but hubris and mass delusion. Only small communities, beginning with the family, can begin to withdraw and prepare for the cataclysm.