GPU video generation and overlay caching (#11)
neri14 authored Aug 26, 2024
1 parent 6d4a5e2 commit feb3e3f
Showing 21 changed files with 1,060 additions and 433 deletions.
4 changes: 0 additions & 4 deletions NOTES.md

This file was deleted.

16 changes: 12 additions & 4 deletions README.md
@@ -2,6 +2,14 @@

[![Build and Test](https://github.com/neri14/videographer/actions/workflows/build-and-test.yml/badge.svg)](https://github.com/neri14/videographer/actions/workflows/build-and-test.yml)


## ToDo
- clean up the pipeline implementation after adding GPU support
- add a -g/--gpu flag to control whether generation runs on the CPU or the GPU
- add an argument for controlling upscaling (if so, also control which template is applied)
- implement overlay caching: draw static texts onto a single buffer, and reuse the overlay when it doesn't change between frames


## Dependencies

Packages required in system to build the application
@@ -25,11 +33,11 @@ Packages required in system to build the application

## Pipeline

1. Record the video
1. Run concat tool to combine the video (TODO what about trimming?)
1. Get video and telemetry
1. Run concat tool to combine the clips
1. Run trim tool to get final base video
1. Run alignment tool and figure out offset
1. Run generator app to generate base video (most probably will be in 1080p)
1. Run upscale tool so YT doesn't kill quality
1. Run generator app to generate base video (upscaled to 4K for YT)
1. Upload


58 changes: 58 additions & 0 deletions gpu.md
@@ -0,0 +1,58 @@
# GPU vs CPU measurements

## Tests
On a 1-minute 1080p file

### Base measurements
#### Original impl
46.866s / 45.314s

#### x264enc -> nvh264enc
30.006s


### After removing overlay
#### Original impl
34.834s
35.933s

#### x264enc -> nvh264enc
6.099s
6.008s


### Added upscale
Upscaling 1080p -> 4K

Bitrate 16 -> 80 Mbit/s

#### Original impl
2m42.928s

#### x264enc -> nvh264enc
17.390s

#### Moved all video to gpu
15.469s
15.564s
15.484s

#### added crude overlay
54.255s, of which 29.196s is measured drawing time (drawing runs on the CPU)

Compared with the ~15.5s full-GPU run without an overlay:
- about 10s is overhead from passing memory to the GPU and the GPU applying the overlay
- about 29s is overhead from crude CPU drawing without caching, which can be improved

Note that the overlay is now generated after scaling, i.e. the CPU measurements drew the overlay at 1080p, while the GPU path draws it at 4K.
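The breakdown above is plain subtraction and can be reproduced in a few lines of C++. This is a sketch only: using the mean of the three "Moved all video to gpu" runs as the baseline is an assumption, and the names `baseline` and `transfer_and_compositing` are illustrative, not part of the project.

```cpp
// Timings copied from the measurements above, in seconds.
constexpr double gpu_no_overlay[] = {15.469, 15.564, 15.484}; // "Moved all video to gpu"
constexpr double with_overlay = 54.255;                       // "added crude overlay"
constexpr double cpu_drawing = 29.196;                        // measured drawing time

// Assumption: the mean of the three full-GPU runs serves as the baseline.
constexpr double baseline()
{
    return (gpu_no_overlay[0] + gpu_no_overlay[1] + gpu_no_overlay[2]) / 3.0;
}

// What remains after subtracting the baseline and the CPU drawing time:
// passing memory to the GPU plus the GPU applying the overlay.
constexpr double transfer_and_compositing()
{
    return with_overlay - baseline() - cpu_drawing;
}
```

With these inputs the remainder comes out a little under 10 seconds, which matches the "about 10s" estimate.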


## Test on raw GoPro video
- Video length: 39m30s
- Resolution: 1080p
- Full GPU impl
- no overlays

Time:
8m45.467s (roughly 4.5x faster than real time)
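For reference, the ratio of footage length to processing time can be computed from the two durations above. The helper below is a hypothetical sketch, not project code: `parse_minutes_seconds` and `realtime_ratio` are made-up names, and it only understands the `XmY.Zs` shape used in this file. Whether the quoted speedup refers to this real-time ratio is an assumption.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Parses durations written like "8m45.467s" or "39m30s" into seconds.
// Returns std::nullopt when the text does not match that shape.
std::optional<double> parse_minutes_seconds(const std::string& text)
{
    const std::size_t m_pos = text.find('m');
    if (m_pos == std::string::npos || m_pos == 0 || text.back() != 's') {
        return std::nullopt;
    }
    try {
        std::size_t used = 0;
        const std::string minutes_part = text.substr(0, m_pos);
        const int minutes = std::stoi(minutes_part, &used);
        if (used != minutes_part.size()) return std::nullopt;

        // Everything between 'm' and the trailing 's'.
        const std::string seconds_part = text.substr(m_pos + 1, text.size() - m_pos - 2);
        const double seconds = std::stod(seconds_part, &used);
        if (used != seconds_part.size()) return std::nullopt;

        return minutes * 60.0 + seconds;
    } catch (const std::exception&) {
        return std::nullopt; // not a number, or out of range
    }
}

// Footage length divided by processing time; above 1 means faster than real time.
std::optional<double> realtime_ratio(const std::string& footage, const std::string& processing)
{
    const auto footage_s = parse_minutes_seconds(footage);
    const auto processing_s = parse_minutes_seconds(processing);
    if (!footage_s || !processing_s || *processing_s <= 0.0) return std::nullopt;
    return *footage_s / *processing_s;
}
```

Calling `realtime_ratio("39m30s", "8m45.467s")` divides 2370 s of footage by 525.467 s of processing.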

37 changes: 37 additions & 0 deletions src/arguments.cpp
@@ -5,9 +5,13 @@ namespace vgraph {
namespace key {
std::string help("help");
std::string debug("debug");
std::string gpu("gpu");
std::string telemetry("telemetry");
std::string input("input");
std::string output("output");
std::string timecode("timecode");
std::string resolution("resolution");
std::string bitrate("bitrate");
}

utils::argument_parser prepare_parser()
@@ -16,9 +20,16 @@ utils::argument_parser prepare_parser()

parser.add_argument(key::help, utils::argument().flag().option("-h").option("--help").description("Print this help message"));
parser.add_argument(key::debug, utils::argument().flag().option("-d").option("--debug").description("Enable debug logs"));
parser.add_argument(key::gpu, utils::argument().flag().option("-g").option("--gpu").description("Use Nvidia GPU for processing"));

parser.add_argument(key::telemetry, utils::argument().option("-t").option("--telemetry").description("Telemetry file path"));
parser.add_argument(key::input, utils::argument().option("-i").option("--input").description("Input video file path"));
parser.add_argument(key::output, utils::argument().option("-o").option("--output").description("Output video file path"));

parser.add_argument(key::timecode, utils::argument().flag().option("-c").option("--timecode").description("Draw a timecode on each frame"));

parser.add_argument(key::resolution, utils::argument().option("-r").option("--resolution").description("Output video resolution, format: WIDTHxHEIGHT"));
parser.add_argument(key::bitrate, utils::argument().option("-b").option("--bitrate").description("Output video bitrate, in kbit/s"));

return parser;
}
@@ -40,16 +51,42 @@ bool read_mandatory_value(const utils::argument_parser& parser, const std::strin
return true;
}

bool parse_resolution(const std::string& str, utils::logging::logger& log, std::pair<int, int>& out)
{
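// find() returns std::string::npos when 'x' is missing; keep it as size_t so the check is reliable
const std::size_t delimiter_pos = str.find('x');
if (delimiter_pos == std::string::npos) {
log.error("Error parsing \"{}\" as resolution, expected format is \"WIDTHxHEIGHT\"", str);
return false;
}
try {
out.first = std::stoi(str.substr(0, delimiter_pos));
out.second = std::stoi(str.substr(delimiter_pos+1));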
} catch (...) {
log.error("Error parsing \"{}\" as resolution, expected format is \"WIDTHxHEIGHT\"", str);
return false;
}
return true;
}

arguments read_args(const utils::argument_parser& parser, utils::logging::logger& log)
{
arguments a;
bool valid = true;

a.debug = parser.get<bool>(key::debug);
a.gpu = parser.get<bool>(key::gpu);

valid = read_mandatory_value<std::string>(parser, key::telemetry, log, a.telemetry) && valid;
valid = read_mandatory_value<std::string>(parser, key::input, log, a.input) && valid;
valid = read_mandatory_value<std::string>(parser, key::output, log, a.output) && valid;

a.timecode = parser.get<bool>(key::timecode);

std::string res_str;
if (read_mandatory_value<std::string>(parser, key::resolution, log, res_str)) {
valid = parse_resolution(res_str, log, a.resolution) && valid;
} else {
valid = false;
}

valid = read_mandatory_value<int>(parser, key::bitrate, log, a.bitrate) && valid;

if (!valid) {
exit(1);
}
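Note that `std::stoi` stops at the first character it cannot parse, so an input such as `1920x1080p` would still be accepted by a `stoi`-based parser. A stricter variant could be sketched with `std::from_chars`, as below. This is an illustration under assumptions, not the project's implementation: `parse_positive_int` and `parse_resolution_strict` are hypothetical names, and rejecting non-positive dimensions is a choice made for this example.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// Parses the whole view as a positive int; trailing characters are rejected.
std::optional<int> parse_positive_int(std::string_view text)
{
    int value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last || value <= 0) {
        return std::nullopt;
    }
    return value;
}

// Parses "WIDTHxHEIGHT" into {width, height}; any other shape yields std::nullopt.
std::optional<std::pair<int, int>> parse_resolution_strict(std::string_view text)
{
    const std::size_t delimiter_pos = text.find('x');
    if (delimiter_pos == std::string_view::npos) return std::nullopt;

    const auto width = parse_positive_int(text.substr(0, delimiter_pos));
    const auto height = parse_positive_int(text.substr(delimiter_pos + 1));
    if (!width || !height) return std::nullopt;

    return std::make_pair(*width, *height);
}
```

Returning `std::optional` leaves the logging and the `valid` flag handling to the caller, which keeps the parsing step easy to unit test.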
7 changes: 7 additions & 0 deletions src/arguments.h
@@ -7,10 +7,17 @@ namespace vgraph {

struct arguments {
bool debug = false;
bool gpu = false;

std::string telemetry = "";
std::string input = "";
std::string output = "";

bool timecode = false;

std::pair<int, int> resolution = {3840,2160};
int bitrate = {80*1024};

static arguments parse(int argc, char* argv[]);
};

24 changes: 22 additions & 2 deletions src/manager.cpp
@@ -7,6 +7,7 @@
#include "video/overlay/overlay.h"

#include <iostream>
#include <chrono>

namespace vgraph {

@@ -20,10 +21,29 @@ void manager::init(int argc, char* argv[])

void manager::run()
{
video::overlay::overlay overlay;
video::generator gen(args.input, args.output, overlay);
log.info("Generation will use {}", args.gpu ? "GPU" : "CPU");
if (!args.gpu) {
log.warning("!! FOR BETTER PERFORMANCE CONSIDER GENERATING VIDEO ON GPU !! (-g/--gpu flag)");
}

auto t1 = std::chrono::high_resolution_clock::now();

video::overlay::overlay overlay(args.resolution, args.timecode);
overlay.precache();

auto t2 = std::chrono::high_resolution_clock::now();

video::generator gen(args.input, args.output, overlay, args.gpu, args.resolution, args.bitrate);
gen.generate();

auto t3 = std::chrono::high_resolution_clock::now();

log.info("Overlay pre-setup time: {:.3f} s",
std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count()/1000.0);
log.info("Video generation time: {:.3f} s",
std::chrono::duration_cast<std::chrono::milliseconds>(t3 - t2).count()/1000.0);
log.info("Total time: {:.3f} s",
std::chrono::duration_cast<std::chrono::milliseconds>(t3 - t1).count()/1000.0);
}
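The three-timestamp pattern in `manager::run()` could also be wrapped in a small helper. The sketch below is hypothetical (the `stopwatch` class is not part of the project) and uses `std::chrono::steady_clock`, which is guaranteed to be monotonic and is therefore a common choice for measuring intervals.

```cpp
#include <chrono>

// Hypothetical helper: measures elapsed wall time since construction.
class stopwatch {
public:
    stopwatch() : start_(std::chrono::steady_clock::now()) {}

    // Seconds elapsed since construction, as a floating-point value.
    double elapsed_seconds() const
    {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        return elapsed.count();
    }

private:
    std::chrono::steady_clock::time_point start_;
};
```

One instance created before the overlay setup and another before generation would give the same three figures that the log lines report, without the repeated `duration_cast` expressions.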

void manager::enable_logging()
26 changes: 26 additions & 0 deletions src/utils/argument_parser.cpp
@@ -138,6 +138,32 @@ std::string argument_parser::get(const std::string& key) const
return val[0];
}

template <>
int argument_parser::get(const std::string& key) const
{
if (!arguments_.contains(key))
throw argument_exception(std::format("Undefined argument \"{}\" retrieval attempt", key));

if (!has(key))
throw argument_exception(std::format("Retrieval of not provided argument \"{}\" value", key));

const auto& val = values_.at(key);
if (val.size() < 1)
throw argument_exception(std::format("Retrieval of argument \"{}\" value that has no associated value", key));
if (val.size() > 1)
throw argument_exception(std::format("Retrieval of singular argument \"{}\" value that has more values", key));

int ret = 0;
try {
ret = std::stoi(val[0]);
} catch(const std::invalid_argument&) {
throw argument_exception(std::format("Error parsing value \"{}\" of argument \"{}\" as int", val[0], key));
} catch(const std::out_of_range&) {
throw argument_exception(std::format("Value \"{}\" of argument \"{}\" is out of range", val[0], key));
}
return ret;
}//FIXME to be refactored after merging with improved arguments parser

template <>
std::vector<std::string> argument_parser::get(const std::string& key) const
{
2 changes: 0 additions & 2 deletions src/video/CMakeLists.txt
@@ -3,10 +3,8 @@ cmake_minimum_required(VERSION 3.20.0)
target_sources(vgraph_lib
PRIVATE
generator.cpp
pipeline.cpp
PUBLIC
generator.h
pipeline.h
)

add_subdirectory(overlay)