Gaze redirection on pre-recorded videos

I am recording video on my phone, using the Elegant Teleprompter app to display my script on screen as I record. The result is pretty good, but I would like to improve it by fixing the eye contact with the lens… It is pretty close already, but it could be a lot better if corrected in post (or in real time, but post is fine).

I’ve done some research and have found NVIDIA Maxine, the NVIDIA AR SDK, and NVIDIA AR… it’s all a little overwhelming as to what is what. My end goal really is to be able to run a command, something like this:

gaze-redirector --input my-source-video.mp4 --output my-output-video.mp4
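
To be clear, no such gaze-redirector tool exists as far as I know; that command is just the interface I’m hoping to end up with. Purely for illustration, here is a minimal sketch in Python (with OpenCV) of the per-frame pipeline I imagine such a tool wrapping, with the actual redirection step stubbed out:

# Hypothetical skeleton of the gaze-redirector CLI I have in mind.
# redirect_gaze() is a placeholder -- a real implementation would call into
# something like the Maxine AR SDK's eye-contact feature for each frame.
import argparse
import cv2


def redirect_gaze(frame):
    """Placeholder: returns the frame unchanged until a real backend is wired in."""
    return frame


def main():
    parser = argparse.ArgumentParser(description="Per-frame gaze redirection sketch")
    parser.add_argument("--input", required=True)
    parser.add_argument("--output", required=True)
    args = parser.parse_args()

    reader = cv2.VideoCapture(args.input)
    fps = reader.get(cv2.CAP_PROP_FPS)
    width = int(reader.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(reader.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(args.output, cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))

    while True:
        ok, frame = reader.read()
        if not ok:
            break
        writer.write(redirect_gaze(frame))

    reader.release()
    writer.release()


if __name__ == "__main__":
    main()

Note that a loop like this drops the audio track (OpenCV doesn’t handle audio), so a real tool would also need to remux the original audio afterwards, e.g. with ffmpeg.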

I have found this: GitHub - mpatacchiola/deepgaze: Computer Vision library for human-computer interaction. It implements Head Pose and Gaze Direction Estimation Using Convolutional Neural Networks, Skin Detection through Backprojection, Motion Detection and Tracking, Saliency Map.
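
One thing I notice about deepgaze: as its description says, it does head pose and gaze direction *estimation*, not redirection, so at most it could measure how far off-axis my eyes are rather than fix them. For reference, this is roughly how its head pose estimator is used, going by the examples in that repo (I haven’t run this; the model file paths are placeholders and the exact API may differ):

# Rough sketch based on the deepgaze repo's head pose example.
# The variable-file paths are placeholders from my checkout notes, and the
# library targets TensorFlow 1.x -- treat this as illustrative only.
import cv2
import tensorflow as tf
from deepgaze.head_pose_estimation import CnnHeadPoseEstimator

sess = tf.Session()
estimator = CnnHeadPoseEstimator(sess)

# Pre-trained weights ship with the repo; adjust paths to your local copy.
estimator.load_pitch_variables("etc/tensorflow/head_pose/pitch/cnn_cccdd_30k.tf")
estimator.load_yaw_variables("etc/tensorflow/head_pose/yaw/cnn_cccdd_30k.tf")

# A tight, roughly square crop of the face works best.
face = cv2.imread("face_crop.jpg")
print("pitch:", estimator.return_pitch(face))
print("yaw:", estimator.return_yaw(face))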

And this page: https://developer.nvidia.com/blog/improve-human-connection-in-video-conferences-with-nvidia-maxine-eye-contact/#sdk_for_developers says " NVIDIA Maxine Eye Contact is available for download in the AR SDK for both Windows and Linux."

I was hoping to find something pre-packaged in the AUR but no luck.

I’m hoping someone can help clarify what the most direct path to achieving this is and help me avoid getting lost in the weeds.

There it states that “this program is limited”.

So you will need to apply and go through the process before you can access NVIDIA Maxine.

OK… so that’s not the path I want, then. Thanks for pointing that out.

Yet this fellow has a tutorial on how to achieve what I want to do on Windows: https://www.youtube.com/watch?v=NcUdTihgA6g

He refers to these two resources:

and
Maxine Windows AR SDK | NVIDIA NGC, but this page is labelled “Windows AR SDK” (you need an NVIDIA account to access this page).

So… I don’t understand that reference to “this program is limited” when this SDK is available for download and people are using it without any mention of early access… I could download that zip, but I haven’t, because it is literally labelled as “Windows”. I have my doubts that it would work in a Windows VirtualBox guest… maybe?

I did find this: Video Effects SDK System Guide - NVIDIA Docs, but I’m not sure it is worth my time to work through it only to potentially hit a brick wall.

There are some services like Descript that offer this… but I’m hoping someone has experience doing gaze correction on Linux so I can avoid paying a monthly fee for a service that is overkill and that I will mostly not use.

I’m also looking into apps that do it on the fly on my Android phone.

I’m trying to avoid asking an X/Y problem as much as I can. My end goal is to record some talking-head scripted tutorials for a course.
By using Elegant Teleprompter on my phone screen and recording with the selfie camera, I get a pretty good result, but I expect fixing the eye gaze would make a big improvement. I’m concerned about going down rabbit holes of compiling SDKs and such when the solution might be “install this app on your phone” or “such-and-such a service does it for free”.

I have been researching around, and I stumbled on this: NVIDIA NIM | eyecontact. I have uploaded a sample video to see how it goes.
I don’t see anything about credit limits etc., so this might do the job without having to install, set up, or build anything.

– Update –

I have uploaded a 15-minute test video to NVIDIA NIM | eyecontact and it did the job. Time will tell how it goes with much more video and whether I need to switch to another solution.

The ‘model card’ on that link states that the model supports Ubuntu and Debian. Since I have this solution for now, I’m putting aside my research into running that model locally, though I’d still prefer to run it locally.
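
For anyone else trying the hosted route: the way I’ve been spot-checking the result is to pull the same timestamp from the input and the corrected output and look at them side by side. This is nothing specific to the NIM service, just a small OpenCV helper; the output filename here is whatever you saved the processed download as:

# Quick spot-check: grab the same timestamp from the original and the
# corrected video and write the two frames side by side for comparison.
import cv2


def grab_frame(path, msec):
    cap = cv2.VideoCapture(path)
    cap.set(cv2.CAP_PROP_POS_MSEC, msec)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError(f"could not read a frame from {path}")
    return frame


def fit_height(img, h):
    # Scale to a common height before stacking, in case resolutions differ.
    return cv2.resize(img, (int(img.shape[1] * h / img.shape[0]), h))


original = grab_frame("my-source-video.mp4", 30_000)      # 30 seconds in
corrected = grab_frame("eyecontact-output.mp4", 30_000)   # filename is a placeholder

h = min(original.shape[0], corrected.shape[0])
cv2.imwrite("compare.png", cv2.hconcat([fit_height(original, h), fit_height(corrected, h)]))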