How to Create Viral AI Headshots for Social Media Apps with Astria.ai: A Tutorial

Astria.ai can be used to create high-quality, lifelike headshots for your applications. Find out how to use Astria in this tutorial.

Key Insights

  • The advanced photographic AI will generate high-quality corporate photos of you in a bunch of different outfits and in different scenarios.

  • Astria specializes in creating production-grade images that serve a range of business use cases.

  • Creating a fine-tuned model is easy, and requires a minimum of four images and selection of a base pre-tuned model.

  • Astria allows users to customize their Stable Diffusion prompts and offers features like ControlNet, Multi-Person Generation, Multi-Pass Inference, and more.

Before we delve into the tutorial, let’s briefly discuss how prompting works with Stable Diffusion AI models.

Stable Diffusion: An Overview

Stable Diffusion is a pathbreaking generative AI technology in which text prompts, an image, or a set of images are used to create output images that are completely original (and not copies of the images supplied as input). Since its launch in 2022, it has gained phenomenal popularity thanks to its ability to generate unique photorealistic images, graphics, artwork and logos, and to help with image editing and retouching. The model has also been used to create short video clips with tools such as Deforum. In late 2023, the release of Stable Video Diffusion gave us the tools to create powerful photo animations from a single image.

The Technology

The best thing about Stable Diffusion models is that they can run even on low-end hardware. This is due to their architecture, which comprises a variational autoencoder (VAE), forward and reverse diffusion, a noise predictor and text conditioning. The VAE consists of an encoder and a decoder; the encoder compresses a 512x512 image into a much smaller 64x64 representation in latent space, which is far cheaper to manipulate. The more recent SDXL 1.0, or Stable Diffusion XL, uses an ‘ensemble of experts’ approach to guide the generation process.
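To put that compression in numbers, here is a quick back-of-the-envelope sketch. The channel counts (3 RGB channels in pixel space, 4 channels in latent space) are not stated above; they are an assumption based on the commonly described Stable Diffusion v1 layout.

```javascript
// Rough size comparison between pixel space and latent space.
// Assumption: 3 RGB channels per pixel, 4 channels per latent position.
function tensorSize(width, height, channels) {
  return width * height * channels;
}

const pixelValues = tensorSize(512, 512, 3);
const latentValues = tensorSize(64, 64, 4);
const compressionFactor = pixelValues / latentValues;

console.log(`Pixel space:  ${pixelValues} values`);
console.log(`Latent space: ${latentValues} values`);
console.log(`About ${compressionFactor}x fewer values to work on`);
```

Under these assumptions the diffusion process handles roughly 48 times fewer values than it would in pixel space, which is a large part of why modest hardware can cope.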

Using Stable Diffusion Models

The key capability of Stable Diffusion, or any diffusion-based image generation model, is its ability to create images from text prompts. By prompting these models with just a single line of text, one can generate realistic, high-quality images. These models can also be fine-tuned with as few as four images, which guide the final outcome. This means that one can train these diffusion models to become experts at generating the kind of image an individual or business requires. This has opened up a world of creativity, where users have created fine-tuned models that generate images and videos as if they are from the movie ‘Barbie’, or a trailer for a fictitious film ‘Barbenheimer’, or stunning architecture and art.

So, once the model is trained on these images, the possibilities of image-generation become extensive. For example, you can place your model in any environment and play with the lighting and effects — say, a photograph of your model in the mountains, at dusk, low lighting, shot with a 35 mm lens on Kodak film.

Astria.ai has partnered with MyHeritage to create a feature named AI Time Machine, which transforms a user’s photos to depict how they might have looked had they lived in an earlier era, among the people from whom they descend.

Astria.ai — An Introduction

Astria is a platform that provides you with out-of-the-box user-friendly tools to fine-tune your Stable Diffusion models. The platform comes with pre-built features and APIs like AI Photoshoot, Product Shots, Inpainting and Masking, and a Fine-Tuning guide that makes the process of AI image generation easy and efficient.

One of the most powerful features of Astria.ai is that you can control most of the platform through an easy-to-use API. This essentially means that any app developer can build their platform on top of the Astria APIs and quickly offer their users the advanced capabilities of Stable Diffusion models.

For social media app developers, for instance, this is a massive opportunity. You could create a fine-tuned model inspired by, say, the Batman universe. You could then deeply integrate the Astria API into your mobile app codebase; then your users could use the images to showcase themselves to their friends. You could even align it to upcoming movie releases, and use that as a monetization model. The possibilities are endless.

Astria.ai for Advertisers, Production Houses, and App Makers

What if advertisers, marketers, app makers and production houses could harness the capabilities of Stable Diffusion via Astria easily?

To date, Stable Diffusion has mostly been used by early adopters, creators, influencers and creative technologists. However, if businesses could harness its capabilities, they could benefit in a number of ways:

Incredibly Creative Visuals: Stable Diffusion models are able to generate visuals that are creative, original and unique. This can help advertisers create powerful campaigns that are visually stimulating, without much production effort.

Cost-Effective: Even simple photoshoots or headshots for advertisements currently require cameras and crew, plus the effort of editing the images. With text-to-image AI, the same result can be achieved through the right fine-tuning and prompting. On a platform like Astria.ai this takes minutes instead of days, reducing the cost to a fraction of what it used to be.

Hyperpersonalization: Another potent capability is the ability to hyperpersonalize visuals to a specific individual. For instance, you could generate visuals that insert the customer or the user into the image, and thereby give them visuals that are designed specifically for them.

Technical Challenges of Fine-Tuning and Prompting

While Stable Diffusion and SDXL 1.0 are powerful technologies, you need sufficient knowledge of ‘prompting’ to control the output images.

Prompting Stable Diffusion is as much a science as it is an art. Also, since the core open-source Stable Diffusion models are only accessible programmatically (there is no interface), non-programmers lack the UI needed to prompt and fine-tune a model properly.

Consider the prompt below, for instance:

a masculine man. Split lighting Windows Swirly bokeh ball. Palace of Versailles Indoors Hall of Mirrors Gilded architecture Framing Surrounded by ornate furnishings Controlled lighting shot on Nikon Z6 Nikon Z 85mm f/1.8 S F/2.8, 1/125s, ISO 100, in style of Petter Hegre - tiled_upscale
BREAK
BREAK ohwx man wearing an expensive fine suit <lora:749183:1>
num_images=4
negative_prompt=anime, cartoon, graphic, text, painting, crayon, graphite, abstract, glitch, deformed, mutated, ugly, long neck, disfigured, fused lips, (navel, exposed midriff)
seed=
steps=20
cfg_scale=3.0
controlnet=
input_image_url=
mask_image_url=
denoising_strength=
controlnet_conditioning_scale=
controlnet_txt2img=false
super_resolution=true
inpaint_faces=true
face_correct=false
film_grain=false
face_swap=true
hires_fix=false
prompt_expansion=false
ar=1:1
scheduler=dpm++sde_karras
color_grading=
use_lpw=true
w=768
h=1024

This prompt transports the user to the Palace of Versailles, and even specifies the camera type, lens type, ISO and the background.
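The prompt above mixes free text with trailing key=value settings. If you store prompts in this form inside your own app, a small parser can split the two apart. The following is only a sketch: `parsePromptBlock` is a hypothetical helper, not part of Astria, and it assumes every setting sits on its own line in lowercase `key=value` form.

```javascript
// Hypothetical helper: split a prompt block into its free-text part
// and a settings object built from the key=value lines.
function parsePromptBlock(block) {
  const textLines = [];
  const settings = {};
  for (const line of block.split('\n')) {
    const match = line.match(/^([a-z_]+)=(.*)$/);
    if (match) {
      // An empty value (e.g. "seed=") is kept as an empty string.
      settings[match[1]] = match[2];
    } else if (line.trim() !== '') {
      textLines.push(line.trim());
    }
  }
  return { text: textLines.join('\n'), settings };
}

// A shortened version of the prompt shown above.
const example = [
  'ohwx man wearing an expensive fine suit',
  'num_images=4',
  'seed=',
  'cfg_scale=3.0',
  'scheduler=dpm++sde_karras',
  'w=768',
  'h=1024',
].join('\n');

const parsed = parsePromptBlock(example);
console.log(parsed.text);
console.log(parsed.settings);
```

All values come back as strings, so convert numeric settings such as `w` and `h` yourself before using them.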

This know-how, which has been mostly accessible to indie developers who have dug into this deep learning AI technology, is going to become increasingly accessible to everyone thanks to platforms like Astria.ai.

Let’s now understand how Astria.ai can assist with production-grade text-to-image generation.

How Astria.ai Can Help with Production-Grade Text-to-Image Generation

I have used a number of text-to-image AI platforms, and I’ve noticed that many of them do not allow fine-grained control of faces, backgrounds and other features of the image. Without this control, it is challenging, especially for businesses, to use image synthesis for actual production use cases. Astria.ai is markedly different. Here, faces do not get distorted, and you get better control of the final image.

Another useful thing about Astria.ai is its ‘Multi-Person’ feature, which generates precise and realistic images.

Astria’s UI takes a little time to get used to — but the effort is worth the results.

Headshot Generation for Social Media with Astria.ai: A Tutorial

Let’s dive deep into the AI Photoshoot feature on Astria.ai. We’ll also take a look at the API, so that if you are an app developer, you can integrate it into your app and allow your users to use it.

Specifically, we’ll fine-tune our Stable Diffusion model with 20 images of a model obtained from an open-access, royalty-free collection (in my case, Pixabay). Then we’ll generate professional-quality headshots to be used on platforms like Instagram, LinkedIn, Facebook, etc. After that, we’ll play around with various photography components like background, lighting, shadows, and grain to refine our images.

The Fine-Tuning Workflow

Log on to Astria.ai and click on Tunes. You will see something like this:

Then click on New Finetune. Give it a title; we’ll use ‘Pixabay Model 1’. Give it a class name; since our model is female, we’ll select ‘woman’ as the class name. Then upload 20 images of the model, and create the tune. This is what our training set of images looks like:

It will take around 15 to 20 minutes for the fine-tuned model to be created. Once the fine-tune is complete you can navigate to it (in this case, ‘Pixabay Model 1’, class ‘woman’) and begin prompting. Remember to use the words ‘ohwx woman’ in your prompt to trigger the model to produce images of your subject. The ‘woman’ part is the class name that you chose earlier.
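If your app lets users type their own prompts, it is easy for them to forget the trigger words. The sketch below shows one way to guard against that; `ensureTrigger` is a hypothetical helper written for this tutorial, not an Astria function.

```javascript
// Hypothetical helper: make sure a prompt mentions the trigger phrase
// "ohwx <className>" so the fine-tuned subject appears in the output.
function ensureTrigger(prompt, className) {
  const trigger = `ohwx ${className}`;
  return prompt.includes(trigger) ? prompt : `${trigger}, ${prompt}`;
}

console.log(ensureTrigger('professional headshot, studio lighting', 'woman'));
// ohwx woman, professional headshot, studio lighting

console.log(ensureTrigger('portrait of ohwx woman chef', 'woman'));
// portrait of ohwx woman chef
```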

Focusing on the smaller details of the prompt will generate better results. Unlike prompting in ChatGPT or other LLMs, the prompts here are focused on describing the essential features, the background, the kind of camera, lighting, lens, angle, and more. You can use one or more pairs of parentheses to emphasize words (e.g., ‘(colorful)’, ‘((cinematic))’), and one or more pairs of square brackets to de-emphasize words (e.g., ‘[hyperrealistic]’). Play around, and you will get better at it.
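The bracket conventions described here are easy to apply programmatically when you assemble prompts in code. This is a minimal sketch; the helper names `emphasize` and `deemphasize` are made up for illustration.

```javascript
// Wrap a phrase in `level` pairs of parentheses to emphasize it.
function emphasize(phrase, level = 1) {
  return '('.repeat(level) + phrase + ')'.repeat(level);
}

// Wrap a phrase in `level` pairs of square brackets to de-emphasize it.
function deemphasize(phrase, level = 1) {
  return '['.repeat(level) + phrase + ']'.repeat(level);
}

const prompt = [
  'portrait of ohwx woman',
  emphasize('colorful'),
  emphasize('cinematic', 2),
  deemphasize('hyperrealistic'),
].join(', ');

console.log(prompt);
// portrait of ohwx woman, (colorful), ((cinematic)), [hyperrealistic]
```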

You can also provide a ‘negative’ prompt of what the image should not contain. In this case, we have provided a negative prompt to ensure that the image generated is safe for work, not too high contrast, and not airbrushed.

Astria.ai provides a UI for users to play around with the prompt, and control the final outcome.

Let’s try a prompt.

Prompt 1: analog style modelshoot style ohwx woman as (wide shot)+++ of trendy and chic instagrammer in a foodie’s paradise, surrounded by an array of colorful desserts, masterpiece, cinematic light, ultrarealistic+, photorealistic+, 8k, raw photo, realistic, hyperrealistic, highest quality, best quality,highly detailed, masterpiece, best quality, extremely detailed cg unity 8k wallpaper, masterpiece, best quality, ultra-detailed, best shadow, detailed background, beautiful detailed face, beautiful detailed eyes best illumination, detailed face, beautiful, dulux, caustic, dynamic angle, beautiful detailed glow. dramatic lighting. highly detailed, insanely detailed hair, symmetrical, intricate details, professionally retouched, elegant, 8k high definition. strong bokeh. award winning photo.

Negative Prompt: NSFW, nude, underwear, muscular, elongated body, high contrast, airbrushed, blurry, (portrait), (a close up), (close up), (closeup), (close-up)

Prompt 2: 8k Linkedin professional photo of ohwx woman in a suit with studio lighting, blurred background, bokeh, corporate portrait headshot photograph best corporate photo winner, meticulous detail, hyperrealistic, centered uncropped symmetrical beautiful.

Negative Prompt: old, wrinkles, mole, blemish, scar, cg, 3d

Prompt 3: ohwx woman on the beach, with a great view of the beach, where you can see its beauty, the photograph should look realistic, realistic beach, add grain, (detail skin texture, ultra-detailed body), atmospheric scene, masterpiece, best quality, (cinematic light),outdoors

Negative Prompt: painting, cartoon, old, wrinkles, mole, blemish, scar, cg, 3d

Prompt 4: RAW photo, portrait of a ohwx woman wearing a red shirt (high detailed skin:1.2), 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3

Negative Prompt: BadDream, drawing, painting, digital art, helmet, nude, nsfw, large breasts

Prompt 5: portrait of ohwx woman chef, model photoshoot, professional photo, kitchen in background, Amazing Details, Best Quality, Masterpiece, dramatic lighting highly detailed, analog photo, overglaze, 80mm Sigma f/1.4 or any ZEISS lens

Negative Prompt: mutated hands, poorly drawn hands, disfigured, malformed limbs

As you can see, we have used a model fine-tuned on a series of images of the same individual, and then used it to generate a range of headshots that the individual can use in a number of different places — from personal to professional settings.

This is just the beginning. The more you play around with the prompts, the more interesting the images get.

API for Fine-Tuning

You can also use Astria.ai APIs to create a fine-tune. This is a useful feature for developers who are building user-centric apps — they can let users upload a few of their headshots and use that to generate a fine-tuned model specific to that user.

Let’s take a look at the fine-tuning API.

// Node.js 16+ (ES module syntax: save the file as .mjs or set "type": "module")
// Creates a fine-tune from image URLs using fetch()
import fetch from "node-fetch";

const API_KEY = 'sd_XXXXXX'; // replace with your own API key
const DOMAIN = 'https://api.astria.ai';

function createTune() {
  const options = {
    method: 'POST',
    headers: { 'Authorization': 'Bearer ' + API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      tune: {
        title: 'Pixabay Woman 1 - UUID - 1234-6789-1234-56789',
        // Hard-coded tune id of Realistic Vision v5.1 from the gallery - https://www.astria.ai/gallery/tunes
        // https://www.astria.ai/gallery/tunes/690204/prompts
        base_tune_id: 690204,
        name: 'woman', // the class name
        branch: 'fast',
        image_urls: [
          'imgurl1',
          'imgurl2',
          'imgurl3',
          'imgurl4'
        ],
      },
    }),
  };
  return fetch(DOMAIN + '/tunes', options)
    .then(r => r.json())
    .then(r => console.log(r))
    .catch(err => console.error('Tune creation failed:', err));
}

createTune();

You can find your API key in the Account section.
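When the image URLs come from user uploads, it helps to validate them before calling the API. The sketch below builds the same request body as the example above and enforces the four-image minimum mentioned earlier in this tutorial. `buildTunePayload` is a hypothetical helper, and the example.com URLs are placeholders.

```javascript
const MIN_IMAGES = 4; // minimum number of training images cited in this tutorial

// Hypothetical helper: validate user input and build the body for POST /tunes.
function buildTunePayload({ title, className, baseTuneId, imageUrls }) {
  if (!Array.isArray(imageUrls) || imageUrls.length < MIN_IMAGES) {
    const count = Array.isArray(imageUrls) ? imageUrls.length : 0;
    throw new Error(`Need at least ${MIN_IMAGES} images, got ${count}`);
  }
  return {
    tune: {
      title,
      name: className,
      base_tune_id: baseTuneId,
      image_urls: imageUrls,
    },
  };
}

const payload = buildTunePayload({
  title: 'Pixabay Woman 1',
  className: 'woman',
  baseTuneId: 690204,
  imageUrls: [
    'https://example.com/1.jpg',
    'https://example.com/2.jpg',
    'https://example.com/3.jpg',
    'https://example.com/4.jpg',
  ],
});

console.log(JSON.stringify(payload, null, 2));
```

The resulting object can be passed to `JSON.stringify` and used as the request body. Failing early on too few images saves a round trip to the API.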

Once the tune is created, you can send prompts to it in the following manner:

// CommonJS syntax; require() works with node-fetch v2 (v3 is ESM-only)
const fetch = require('node-fetch');
const FormData = require('form-data');

const API_URL = 'https://api.astria.ai/tunes/1/prompts'; // 1 is a placeholder tune id
const API_KEY = 'YOUR_API_KEY'; // Replace with your actual API key
const headers = { Authorization: `Bearer ${API_KEY}` };

const form = new FormData();
form.append('prompt[text]', 'portrait of ohwx woman chef, model photoshoot, professional photo, kitchen in background, Amazing Details, Best Quality, Masterpiece, dramatic lighting highly detailed, analog photo, overglaze, 80mm Sigma f/1.4 or any ZEISS lens');
form.append('prompt[negative_prompt]', 'mutated hands, poorly drawn hands, disfigured, malformed limbs');
// Multipart form values are sent as text, so the flags are passed as strings
form.append('prompt[super_resolution]', 'true');
form.append('prompt[face_correct]', 'true');
form.append('prompt[callback]', 'https://optional-callback-url.com/to-your-service-when-ready?prompt_id=1');

fetch(API_URL, {
  method: 'POST',
  headers: headers,
  body: form
})
  .then(response => response.json())
  .then(json => console.log(json))
  .catch(err => console.error('Prompt request failed:', err));

Read more about the API here.
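The callback URL in the request above carries a prompt_id query parameter, which lets your service match a finished prompt to the user who requested it. Here is a minimal sketch of building and reading such a URL with Node's built-in URL class. The helper names and the example.com address are hypothetical, and what Astria sends to the callback is not covered here.

```javascript
// Hypothetical helper: attach your own identifier to the callback URL.
function buildCallbackUrl(baseUrl, promptId) {
  const url = new URL(baseUrl);
  url.searchParams.set('prompt_id', String(promptId));
  return url.toString();
}

// Hypothetical helper: read the identifier back when the callback arrives.
function readPromptId(requestUrl) {
  return new URL(requestUrl).searchParams.get('prompt_id');
}

const callback = buildCallbackUrl('https://example.com/astria-callback', 42);
console.log(callback);
// https://example.com/astria-callback?prompt_id=42

console.log(readPromptId(callback));
// 42
```

Note that `readPromptId` returns the identifier as a string, so convert it if your database keys are numeric.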

Final Words

Astria.ai is a useful tool for businesses looking to generate high-quality photographs for their products, social media handles, and other use cases. It can also be a valuable tool for creators and influencers looking to expand their digital footprint with minimum effort. With Astria, I was able to control the quality of the output and the results were impeccable and production-grade, with very few rough edges. I recommend you give it a go and see the results for yourself.