The Right Handles
How to cite this article:
APA (7th edition)
Spencer, H. (2026, February 25). The Right Handles. In {dp} · doble página. https://herbertspencer.net/2026/the-right-handles
MLA
Spencer, Herbert. "The Right Handles." {dp} · doble página, 25 February 2026, https://herbertspencer.net/2026/the-right-handles.
Chicago
Spencer, Herbert. "The Right Handles." {dp} · doble página. Published February 25, 2026. https://herbertspencer.net/2026/the-right-handles.
Between a phrase and a pictogram there is a gap. Not just a technical gap, the kind engineers like to close, but a space full of decisions that somebody has to make: what to show, what to leave out, how abstract to be, how to arrange elements so the image means what it needs to mean for a particular person in a particular situation. Professionals who work in augmentative and alternative communication navigate this gap every day, largely by instinct. They have experience but not explicit maps. They make good choices but cannot always say why, or point to where exactly the choice was made.
My doctoral research at AUT asks a simple question about that gap: which of those decisions can be surfaced, made visible, and put under someone’s hand? This is not a theoretical exercise. There is a working proof of concept: pictos.net, where anyone can type a phrase and receive a structured pictogram, generated in real time. The SVG that illustrates this post was made there. Everything that follows, the research question, the design decisions, the notion of handles, grows out of what that prototype has already made visible and what it has yet to resolve.
What I mean by handles
When you use a generative model to produce an image, you type a prompt and get a result. You can accept it or reject it. That is one handle, and it is crude: it gives you a verdict but no grip on the process. You cannot reach into the image and say this particular feature is wrong, change it while keeping everything else. You cannot record why you rejected version three and preferred version five. You cannot trace, weeks later, what reasoning led to the pictogram that a student now relies on.
In AAC practice, this matters. Pictograms are not illustrations. They are communicative infrastructure: symbols that people learn, reuse, and depend on to navigate their day. A speech-language therapist who chooses a pictogram for "take the bus" is not decorating a page; she is making a commitment that this image will mean this thing, reliably, across settings. If a generative tool produces that image, the professional must still own the decision. The tool proposes; the professional authors.
The phrase “the right handles” names the design problem. Not more automation, not faster output, but the right points of intervention: places where a professional can inspect what the system did, adjust what needs adjusting, and leave a trace of why.
The situation
The pictogram libraries that dominate AAC practice1 were built for young children. Their visual register shows it: rounded figures, simple outlines, a representational style calibrated to early childhood. This is not a minor aesthetic complaint. For an autistic young person learning to manage a household, use public transport, or speak up for themselves, childish-looking communication materials carry a real cost in dignity2.
These libraries are also finite. They cover common vocabulary well but cannot anticipate every situated need. When a professional in Santiago needs a pictogram for "scan the Bip card on the bus reader", no library will contain it. The workarounds are familiar: combine existing symbols, accept a near-match, draw something by hand. Each workaround shifts meaning, breaks visual consistency, and adds time that practitioners do not have.
Generative image models could close this gap. They are fast, cheap, and increasingly capable. But the way we currently use them, prompt in, image out, leaves the professional with almost no control over the things that matter most: which features are depicted, at what level of abstraction, in what style relative to the rest of the set, and why. Worse, it leaves no record. A pictogram generated on Tuesday and accepted without documentation becomes untraceable by Friday. This is at odds with how AAC work actually operates: a cycle of proposing, checking, revising, and re-validating, where decisions must be justifiable and retrievable because the symbols will be taught, shared, and reused across people and settings3.
The question
My research asks:
How can generative image tools be designed to make the construction of AAC pictograms inspectable and controllable by the professionals responsible for validating them?
This is an interaction design question, not a machine learning question. The models already generate images. What does not yet exist is the interface: the set of representations, controls, and documentation structures through which a professional can author a pictogram using generative means while retaining full visibility over the process.
PictoNet
The project builds around PictoNet, a proof-of-concept text-to-pictogram system I have been developing as part of this doctorate. I have written about its early motivations and broader architecture here before. The system takes a phrase, analyses its communicative intent and visual components, and produces a structured SVG where each element is labelled, editable, and traceable back to a decision.
PictoNet pictogram for “let’s see together the magnificent spectacle of language”. The SVG file contains, alongside the drawing, a full semantic record: the original utterance, its pragmatic classification, the NSM primes, the role of every visual element, and the provenance of the generation. The image is its own documentation.
PictoNet is not a product. It is a design probe4: an artefact built to make professional judgement visible and to evolve through the questions it raises when real practitioners use it. The point is not to deliver a finished tool but to discover, through making, what the right handles actually are.
This means treating generative AI not as a solution but as a design material5: something with properties, resistances, and affordances that must be learned through practice. In AAC, this carries ethical weight. These pictograms directly affect communication for people with complex needs. Getting the depiction wrong is not an aesthetic failure; it is a communicative one. The requirements for professional oversight, transparency, and accountability are not optional extras. They are the design brief.
The SVG as single source of truth
There is a design decision in PictoNet that deserves its own remark. Each pictogram is produced as an SVG file, and that file is not merely a drawing. It carries, inside its <metadata> element, a structured JSON record of everything the system knew and decided at the moment of generation: the original utterance, its pragmatic classification (hortative, commissive, volitional), the Natural Semantic Metalanguage primes6 that anchor the visual composition, the frame-semantic roles of each element, and the provenance of the generation itself (model version, timestamp, licence).
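A sketch of what such a file might look like. The field names and values here are illustrative, not PictoNet's actual schema; the point is the shape: drawing and record in one document.

```xml
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 240 240">
  <metadata>
    {
      "utterance": "let's see together the magnificent spectacle of language",
      "pragmatic_class": "hortative",
      "nsm_primes": ["SEE", "TOGETHER"],
      "elements": [
        {"id": "figure-1", "role": "Agent", "maps_to": "SEE"}
      ],
      "provenance": {
        "model": "pictonet-poc",
        "timestamp": "2026-02-25T12:00:00Z",
        "licence": "…"
      }
    }
  </metadata>
  <g id="figure-1"><!-- labelled, editable drawing elements --></g>
</svg>
```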
This is not metadata for the sake of thoroughness. It is the mechanism through which the pictogram becomes inspectable. A therapist can open the file and see not just the image but the reasoning behind it: why this element was placed here, what concept it maps to, what the system interpreted as the communicative intent. If she disagrees, she can change the drawing and update the record. If a colleague inherits the file six months later, the rationale is still there. The SVG is simultaneously the artefact and its documentation, the pictogram and the audit trail. One file, one truth7.
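Because the record travels inside the file, inspecting or updating it requires nothing beyond an XML parse and a JSON parse. A minimal Python sketch of the round trip, using only the standard library (field names illustrative, not PictoNet's actual schema):

```python
import json
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialise with the default SVG namespace


def embed_record(svg_text: str, record: dict) -> str:
    """Attach a JSON decision record to an SVG via its <metadata> element."""
    root = ET.fromstring(svg_text)
    meta = ET.SubElement(root, f"{{{SVG_NS}}}metadata")
    meta.text = json.dumps(record, ensure_ascii=False)
    return ET.tostring(root, encoding="unicode")


def read_record(svg_text: str) -> dict:
    """Recover the decision record from the same file: one file, one truth."""
    root = ET.fromstring(svg_text)
    meta = root.find(f"{{{SVG_NS}}}metadata")
    return json.loads(meta.text)


# Illustrative record following the fields described above.
record = {
    "utterance": "let's see together the magnificent spectacle of language",
    "pragmatic_class": "hortative",
    "nsm_primes": ["SEE", "TOGETHER"],
}

svg = ('<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
       '<circle cx="50" cy="50" r="40"/></svg>')
annotated = embed_record(svg, record)
assert read_record(annotated)["pragmatic_class"] == "hortative"
```

A therapist-facing tool would of course wrap this in an editor, but the mechanism is the same: whoever revises the drawing can revise the record in the same file, so artefact and rationale cannot drift apart.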
What I am looking for
Four things:
- The decision criteria AAC professionals actually use when they accept, reject, or revise a pictogram. Not what the literature says they should use, but what they do.
- A proof-of-concept system that exposes those decisions as inspectable, controllable steps within a generation workflow.
- Evidence, gathered through co-design, of whether the system’s controls actually help professionals locate and repair problems in generated pictograms.
- A documented account of the design decisions and their rationales, as a methodological contribution to research-through-design with generative AI.
Why Chile, why now
AAC research has been developed overwhelmingly in English-language settings. The dominant libraries carry the visual conventions and cultural assumptions of Europe and North America. This project generates design knowledge from Chilean professional practice, in Spanish, with a population whose needs remain largely absent from the international literature.
Chile presents specific conditions that sharpen the problem. A culturally diverse population8. A public discourse around autism that has historically infantilised autistic people, framing them as permanent children rather than as adults with lives, routines, and communicative agency of their own. Recent legislation, Ley 21.545 (2023), now recognises adults with autism as rights-holders across education, health, and social participation, but leaves the transition to independent living practically unaddressed as explicit policy. In this context, pictogram materials that are age-appropriate, locally situated, and produced with professional accountability are not a refinement. They are a matter of dignity.
A constructive turn
I have written before about the centaur moment: the brief window where AI extends human capability rather than replacing it. This doctoral project is an attempt to work inside that window with care. Not by refusing generative tools, and not by accepting their output uncritically, but by designing the conditions under which professional intelligence can guide, inspect, and correct what the model produces.
The contribution will not be a finished product. It will be a documented account of what happens when you treat generative AI as a material to be shaped rather than a service to be consumed. The aim, throughout, is to keep the professional as the author of every pictogram that enters someone’s communication system.
The code is on GitHub. The work continues.